For the past six months, during my spare time, I have been maintaining the software of an online marketplace. The marketplace provides financial, legal and technical infrastructure, buyer-seller conflict resolution and marketing. It lets the sellers focus on their products, shielding them from infrastructure and payment complexities. This marketplace was founded by a non-technical individual and its initial version was developed by a software agency in Türkiye using PHP.
The timeline of the project before I got involved:
- The website domain name was registered on 22 December 2020.
- Frontend design with Figma was done by the end of 2021.
- Web application development started by the end of 2021.
- Judging from the oldest product addition date to the database, the site went live on 29 April 2022.
- Approximate initial design and development cost was around 7K$ and it took 6 months to launch the site. Currently, site hosting, server maintenance, minimal software development assistance and company taxes cost around 260$/month (excluding marketing expenses).
The website is primarily accessed via mobile phones.
Before undertaking this work, I had mainly developed desktop applications and hardware-in-the-loop systems using Java and C++ for 12 years. I had zero experience with PHP and databases. My web application development knowledge was limited to a few hobby projects. My motivation was to improve my web programming skills on a live website (with real requirements) and to be of help to others. I faced the following challenges after I took over the code base:
- The software agency was not very responsive. I was informed that, at the beginning of the project, they responded quickly to change requests. However, as time progressed, their response times slowed down, eventually coming to a halt after a year (that's when I joined). This pattern is typical in software projects because most of them are not regularly cleaned up. As software grows, it accumulates 'junk' or redundant and outdated code with lots of implicit dependencies. This accumulation makes modifications that were initially easy to implement increasingly difficult, and sometimes nearly impossible. Besides the code being difficult to understand, even a simple change might break some seemingly unrelated part.
- Another reason for their unresponsiveness was that they had new projects with higher priority. Video with similar message: Never outsource your tech startup MVP
- The software agency's goal of high customizability (because they were using the code in other projects too) and their decision not to use an existing framework like Laravel resulted in enormously complex code. Most of the customization features were never used, yet it was impossible to handle special cases.
- There were many features that worked some of the time but broke when there was the slightest variation.
- The web app was not scalable at all. As more products were added by sellers (which is a good thing), page load times increased from a couple of seconds to over 13s and it was getting worse by the day. One reason was that a product page was also downloading similar products which were more than a hundred for any given product! Another reason was inefficient database queries.
- Due to complicated routing structure, login status had to be checked in every method that was modifying the database... and checks were forgotten in a lot of such methods, making the site vulnerable to bad actors.
- All files and folders were under public_html, and the .htaccess file was not configured to prevent access to PHP files. This meant that anyone could run any PHP file if they could guess its name! The first thing to do in terms of cybersecurity is to separate the public and private portions of your web application, or at least configure .htaccess to prevent direct access to PHP files.
- No 2 factor authentication on web server management panel (CPanel) login. If someone can get the user name and password, they can delete or encrypt the entire site for ransom. Since backups are also saved on the same CPanel, everything can be lost in an instant. For a cautionary tale, see [PDF] British Library cyber incident review.
- Passwords were hashed with MD5 despite PHP having a much stronger password_hash function. Note that MD5 is unsuitable for password storage due to its fast processing speed, which makes it vulnerable to brute-force and rainbow table attacks.
- Server/database updates were done at prime time (e.g. 21:30) instead of late at night, which was causing loss of data like losing a product comment that was written around that time.
- API keys were embedded into code instead of reading them from the database or from environment variables.
- There was no documentation, only raw source code without any useful comments.
- There was no version control and no issue tracking system. There was a demo folder on the server; after a few tests on demo, files were manually copied to the live folder.
- Copying an older project - more than 5200 files excluding image files - led to over 90% of the code being irrelevant for this project, cluttering search results when querying the codebase. For instance, there is a database table called cities that includes every city worldwide, amounting to over 147K entries! For the foreseeable future, we need less than 100 cities which means a bloat factor of 1500X! As if that is not confusing enough, there is also a states table which has some overlap with cities table. Some code uses cities table and other code uses states table.
- There was a great deal of copy/pasted code (anti-DRY) instead of common code wrapped inside static functions and reused. When you find an error in one place, you have to correct it in 10 other places.
- Limited PHP knowledge resulted in non-idiomatic PHP and reinventing the wheel, e.g. creating their own transliteration-to-ASCII array instead of using the iconv function. This increased the amount of code while lowering its quality.
- Similar functionality is in different files, e.g. library/image.php, tool/image.php.
- The same concept is represented by different names, e.g. in one place it is called 'coupon', in another it is called 'promotion_code'.
- One of the frequent use cases is displaying a list, editing or deleting from list or adding to list. For editing and adding, a new form is displayed. All these cases are managed inside a single index() method of the relevant controller, with lots of if statements.
- There were multiple functions that seemed to do the same thing, but only one was actually active; the others were unused leftovers from the past. For instance, I spent half a day modifying the user registration code, wondering why my changes were not reflected in the registration process, only to discover later that I was working on the obsolete version!
- The biggest sin in terms of productivity was not using any IDE. I couldn't believe anyone could work that way in 2023! This means no static analysis feedback for errors like querying non existing database tables and no warnings for bad practices like using non-traceable magic methods. No feedback for unused methods littering the source code. No ctrl+click navigation to method definitions, no automatic refactoring. Everything had to be done with ctrl + f text search which made refactoring extremely tedious.
- Another serious side effect of not using an IDE was not being able to get the benefit of autocomplete for methods existing in a class (or one of its ancestors), which in turn led to using primitive but complicated ways of extracting data from an object. Example code below shows data processing (red) instead of just using the relevant object method (green):
- Magic methods were used extensively, which meant that it wasn't clear whether a method existed in a class or not. One had to perform a text search of all the files to verify. There were many cases where the method did not exist or the input parameters were wrong! The only reason the code did not fail was that those code paths had not been triggered yet.
- CSS bloat (109 CSS files!) made it difficult to find the root cause of a style not behaving as expected. Examples: Not being able to increase the width of a textarea despite setting width: 100%. Adding a required attribute to a checkbox does prevent form submission on button press but it does not show standard error message when checkbox is not selected.
- PHP version was 7.3, which masked lots of errors. For example, a typo in a statement that sets a property could go unnoticed because PHP silently allows all dynamic properties.
- Instead of fixing the problems reported even by PHP 7.3, like "Undefined index" notices (reported as "Undefined array key" warnings since PHP 8.0), the messages were muted with a custom error handler.
- Instead of using namespaces to avoid class name collisions, long class names like ModelCommonConfig were used.
- For most database tables, the older MyISAM (no foreign keys support) storage engine was used instead of the newer InnoDB (used as default in MySQL since 2010).
- For many-to-many relationships, serialized strings were used to store multiple values in a single MySQL column instead of pivot tables, decreasing efficiency due to need of deserialization and the database engine not being able to optimize searches. It also made the code difficult to understand.
- Insufficient logging: There are only CPanel's error_log and connection logs using a simple text file instead of using an existing library like log4php.
- There were no automated tests. Whenever you update the code, you have to think of a good set of test cases and carry them out manually which results in much less testing than would be necessary.
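To illustrate the password hashing item in the list above, here is a minimal sketch of how legacy MD5 hashes could be upgraded at login time without forcing a password reset. It assumes the legacy hashes are unsalted 32-character MD5 hex strings; `verifyAndUpgrade` is a hypothetical name, not code from the site, and the caller is expected to store the returned hash in the user's row.

```php
<?php
declare(strict_types=1);

/**
 * Verifies a login attempt against a stored hash that may still be legacy MD5.
 *
 * @return array{0: bool, 1: ?string} [password is valid, new hash to store or null]
 */
function verifyAndUpgrade(string $password, string $storedHash): array
{
    // Assumption: legacy rows hold an unsalted MD5 digest (32 hex characters).
    $isLegacyMd5 = preg_match('/^[a-f0-9]{32}$/i', $storedHash) === 1;

    if ($isLegacyMd5) {
        // hash_equals() compares in constant time to avoid timing leaks.
        $valid = hash_equals(strtolower($storedHash), md5($password));

        // On success, hand back a modern hash so the MD5 value can be replaced.
        return [$valid, $valid ? password_hash($password, PASSWORD_DEFAULT) : null];
    }

    if (!password_verify($password, $storedHash)) {
        return [false, null];
    }

    // Re-hash when PHP's default algorithm or cost has changed since the hash was made.
    $newHash = password_needs_rehash($storedHash, PASSWORD_DEFAULT)
        ? password_hash($password, PASSWORD_DEFAULT)
        : null;

    return [true, $newHash];
}

// Usage: a user whose row still contains an MD5 hash logs in successfully.
[$ok, $newHash] = verifyAndUpgrade('s3cret', md5('s3cret'));
if ($ok && $newHash !== null) {
    // Hypothetical follow-up: UPDATE users SET password = :newHash WHERE id = :id
}
```

With this approach each account is migrated the first time its owner logs in; accounts that never log in again keep their MD5 hash and would need a forced reset.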
The sole reason these issues haven't led to a crisis yet is the website's relatively low traffic volume. Presently, the founder is seeking investment. Post-investment, intensive marketing is planned, which is expected to result in significant traffic spikes.
I didn't have the luxury of re-writing the site from scratch because it was live with paying users and urgent fixes were needed. Since there was no requirements/design document to guide me and my web skills were limited, a full rewrite would have taken months in the best of conditions. I focused on incremental improvements and documentation. My technical actions in the past 6 months:
- Rather than using the demo folder on the server, I configured the code to run on my local machine and tested it there. This approach was significantly faster than repeatedly updating files on the server for testing every time I made a change.
- Created a private GitHub repo and used it for issue tracking too. I still have to manually copy my changes to the server after I verify them locally.
- Started using the PhpStorm IDE and correcting errors and warnings pointed out by its static analyzer.
- Wrote documentation, detailing general workflow and how to install development environment locally. I employed interns who found it challenging to understand the project, which underscored the mostly unnecessary complexity of the application. However, my documentation facilitated their onboarding process.
- Started doing weekly manual backups of the database SQL file and image folder to my local computer, so that if the server is hacked or some corruption occurs, the site can be restored from my local backup with a data loss of at most 1 week. Of course, this is a short-term solution; an automated, more frequent and cloud-based solution has to be implemented.
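The automated backup mentioned in the last item could start from something like the sketch below. It only covers the database dump, not the image folder, and everything specific in it is an assumption: the function names, the paths, the database name `shop` and the 7-day retention are hypothetical. The `mysqldump` options used (`--defaults-extra-file`, `--result-file`) are standard ones.

```php
<?php
declare(strict_types=1);

/**
 * Builds a timestamped dump path and the mysqldump command that writes to it.
 *
 * @return array{0: string, 1: string} [target file path, shell command]
 */
function buildBackupCommand(
    string $database,
    string $backupDir,
    string $credentialsFile,
    DateTimeImmutable $now
): array {
    $target = rtrim($backupDir, '/') . '/' . $database . '-' . $now->format('Ymd-His') . '.sql';

    // --defaults-extra-file must be the first option; reading credentials from
    // a file keeps the password out of the process list and the source code.
    $command = sprintf(
        'mysqldump --defaults-extra-file=%s --result-file=%s %s',
        escapeshellarg($credentialsFile),
        escapeshellarg($target),
        escapeshellarg($database)
    );

    return [$target, $command];
}

/**
 * Returns the paths of dumps older than $keepDays.
 *
 * @param array<string, int> $files map of file path => modification time (Unix timestamp)
 * @return string[]
 */
function expiredBackups(array $files, int $keepDays, DateTimeImmutable $now): array
{
    $cutoff = $now->getTimestamp() - $keepDays * 86400;

    return array_keys(array_filter($files, fn(int $mtime): bool => $mtime < $cutoff));
}

// Usage, e.g. from a nightly cron job (paths are placeholders):
$now = new DateTimeImmutable('2024-01-07 03:00:00');
[$target, $command] = buildBackupCommand('shop', '/var/backups/shop', '/home/me/.backup.cnf', $now);
// exec($command, $output, $exitCode);  // run the dump, then check $exitCode === 0
```

Keeping the command construction separate from its execution makes the script testable without a database, and the retention helper can then be used to delete old dumps after a successful run.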
Plans for 2024:
- Delete unused database tables and related source code.
- Get rid of all magic method calls.
- Get rid of all unused code, probably more than 90% of existing code will be deleted.
- Separate index() methods into separate POST handling methods.
- Move duplicate code into static functions, which will decrease the code by another 90%. In the end, I expect the code to be 1% of its original size.
- Remove rarely used features from admin and vendor panels.
- Improve documentation.
- Get rid of CSS bloat.
- Get rid of unnecessary components like CKFinder.
- Add analytics to the admin panel for metrics that cannot be collected by Google Analytics, like cities of vendors, cities to which products were shipped.
- Add 2 factor authentication to CPanel login.
- Automate incremental backups, first to a local PC, then to the cloud.
- Simplify the page rendering sequence of operations. Convert pages that rarely change into static ones, so that they don't need to load content from the database.
- To prove ease of migration, verify that the website can be hosted on my own hobby server.
- Switch to PHP 8+ to get more feedback / support from the language instead of silently masking problems, as is the case with PHP 7.3.
- Use a mature framework like Laravel or aimeos.
- Switch to an API-centric architecture for communication with the planned mobile app. Build the initial version of the mobile app for iOS and Android. Use React to have a single code base for the web app and the planned mobile app.
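The plan to split the index() methods can be sketched as follows. This is an illustration under assumptions, not the site's code: `CouponController`, `dispatch` and the route keys are hypothetical, the array stands in for a database table, and the syntax requires PHP 7.4 or later. The login check lives in one place, so a handler that modifies data cannot forget it.

```php
<?php
declare(strict_types=1);

/** Hypothetical controller: one small method per action instead of one index() full of ifs. */
final class CouponController
{
    /** @var array<int, string> in-memory stand-in for a database table */
    private array $coupons = [1 => 'WELCOME10'];

    public function listAll(): array
    {
        return $this->coupons;
    }

    public function create(array $input): array
    {
        $this->coupons[] = (string) $input['code'];
        return $this->coupons;
    }

    public function delete(array $input): array
    {
        unset($this->coupons[(int) $input['id']]);
        return $this->coupons;
    }
}

/** Maps "METHOD action" to an explicit handler and enforces login centrally. */
function dispatch(
    CouponController $controller,
    string $httpMethod,
    string $action,
    array $input,
    bool $loggedIn
): array {
    // Each entry: [requires login, handler]. The explicit method calls stay
    // traceable by an IDE, unlike magic or string-based dynamic calls.
    $routes = [
        'GET list'    => [false, fn(): array => $controller->listAll()],
        'POST create' => [true,  fn(): array => $controller->create($input)],
        'POST delete' => [true,  fn(): array => $controller->delete($input)],
    ];

    $key = strtoupper($httpMethod) . ' ' . $action;
    if (!isset($routes[$key])) {
        throw new InvalidArgumentException("No route for {$key}");
    }

    [$requiresLogin, $handler] = $routes[$key];
    if ($requiresLogin && !$loggedIn) {
        throw new RuntimeException('Login required');
    }

    return $handler();
}

// Usage: an anonymous visitor may list coupons but not create one.
$controller = new CouponController();
$visible = dispatch($controller, 'GET', 'list', [], false);
```

A framework such as Laravel provides this kind of routing and middleware out of the box, so a hand-written dispatcher like this would only be an intermediate step.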
Plans for 2025+:
- Use test automation, CI/CD (dev-test-live environments) for verifying changes in GitHub and deploying to live server if all tests pass, without any manual file copying. Unit tests are particularly important for an interpreted language such as PHP, where the absence of a compiler, unlike in compiled languages like Java or C++, means that even simple syntax errors are only caught during testing, or worse, in production.
My advice for people who want to start an online marketplace:
- For a quicker time-to-market, use a platform like Sharetribe. You can launch your initial version in just one day, without any coding, rather than spending several months! Once you have passed the idea validation and MVP phases with the free plan (0$), if you live in a country where Stripe or PayPal is not available (e.g. Türkiye), you have to get help from a software freelancer to write code for payment gateway (e.g. iyzico) integration and host the custom code on your own server to enable money transfers. Note that either you have to understand the basics of server maintenance or you have to pay an engineer's retainer. Your costs would be a one-time fee for user interface / graphic design, payment gateway integration and server setup, 300$/month for the Sharetribe yearly subscription and 20$/month for server hosting (assuming you do the server maintenance). You would also need to budget for insurance and taxes (~200$ for Türkiye) and marketing (>100$/month) because word of mouth probably won't be enough. For the first year, you need at most 8K$. Assuming your commission rate is 20%, if sellers generate approximately 40K$ a year in sales through your marketplace (8K$ in commission), you would be at the break-even point and potentially become attractive for outside investment (if you have collected metrics). When you reach this milestone, which might take a couple of years, you might consider quitting your day job and dedicating yourself full time to your marketplace business.
- For Türkiye, you might also consider using WordPress hosting, which has a regular price of 20$/mo, 6X cheaper than VPS hosting with CPanel at a regular price of 120$/mo. You can use the WooCommerce and iyzico payment plugins for payments in Türkiye to get started. If you want custom code, you have to write it as WordPress plugins. If you can install your own web server and WordPress without any admin panel, i.e. self-host your web app, you could get away with 5$/mo for VPS and 20$/year for domain name. Just ask ChatGPT: "Steps to self host my e-commerce web app with https and wordpress on a linux VPS, buy a domain name, and point it to my VPS IP"
- If you choose not to use an existing platform and prefer custom development, be prepared for at least a three-month commitment and significant financial investment. If you're not familiar with software development, find a trusted individual with relevant experience to assist you. This person may not write the code themselves, but they can help manage the development process. When selecting a software agency, if they don't use an IDE, don't even consider working with them. If they do, ensure they also use standard frameworks (e.g. Laravel for PHP. See Udemy marketplace build course, aimeos open source), rather than their own. In fact, you can use this blog as a software agency maturity evaluation checklist. Otherwise you will end up with an unmaintainable mess and all your efforts up to that point may go to waste.
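The break-even figure in the Sharetribe item above comes from dividing the yearly cost by the commission rate. A small sketch makes it easy to try other numbers; `breakEvenSales` is a hypothetical helper, and the inputs are the 8K$ first-year budget and 20% commission quoted above.

```php
<?php
declare(strict_types=1);

/**
 * Yearly sales volume at which commission income covers the yearly cost.
 * Amounts are in whole dollars and the commission is in whole percent.
 */
function breakEvenSales(int $annualCost, int $commissionPercent): float
{
    if ($commissionPercent <= 0) {
        throw new InvalidArgumentException('Commission must be positive');
    }

    // sales * (commission / 100) = cost  =>  sales = cost * 100 / commission
    return $annualCost * 100 / $commissionPercent;
}

// First-year budget of 8,000$ at a 20% commission.
echo number_format(breakEvenSales(8000, 20)), "\n"; // prints 40,000
```

At a 10% commission the same budget would require twice the sales volume, which is why the commission rate matters as much as the cost side.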
Music: Karsu - Jest Oldu