Two Things to Consider When Deciding Between Web Scraping Software and Data as a Service (DaaS)
Continuing from my previous post, “Programmer, Data as a Service (DaaS), Software, or Licensing… what is right for my Web Data product?” (http://tinyurl.com/qavxcqw), let’s explore what I believe are the two main components in the decision process when choosing between DaaS and a platform.
Is it your core? – Will this be part of your core product? If the service goes down or the quality starts to suffer, will it hurt your credibility with your customers or, worse yet, your pocketbook? Do you have punitive SLAs between yourself and your vendors and expect the same to be in place upstream from your data provider? If the answer to all of these is “Yes,” then bring the software in-house. I have never been a fan of outsourcing key components of your vertical, so owning the entire pipe (source → post-process cleanup → consumption) will be essential in ensuring you provide the quality of service your customers expect.
There is a valid argument against doing so, which usually runs along the lines of “you will never be able to use the software as well as we can, we are the experts…”. I completely agree with it, but I propose that you look at a hybrid relationship, one where the software vendor provides professional services in support of their platform, which is installed in your enterprise. This can be done in parallel with bringing your own team up to speed if you eventually want to own the proprietary building components, or you can explore contracting FTEs from the vendor or their partner network. It will cost a little more than buying the software outright, but you get an increased level of ownership from your vendor while still coming in a little cheaper than an all-out DaaS model, and you keep the comfort of owning your entire data acquisition workflow.
Remember: Don’t outsource your vertical! It will bite, and it’s not a matter of if, but when.
What’s it going to cost? – “If cost was not an issue…” we’d all be cruising around on 200-foot mega yachts sipping Dom. But seriously, cost will always be a critical component. No matter how it’s sliced, the value of a product built on extracted data comes from whatever the post-process (after extraction) secret sauce is. This is true whether the product is for the street or for internal consumption. If you find yourself confused by this, I suggest taking a hard look at your value prop because, as served, web data is commodity data with very little inherent value of its own.
But back on topic: buying a platform will normally come at a discount compared to a DaaS offering. This logic holds when dealing with the top-tier vendors and starts to become paradoxical as you move further down the ladder to less sophisticated vendors (platforms and BPOs that provide something akin to a DaaS offering without the maturity of enterprise software). Where the product is at (be it a prototype, alpha, or general release with paying customers) should help guide the decision. Don’t buy a platform when what’s needed is a few thousand rows of data to help frame out your prototype; there are plenty of vendors out there who can do that for you on the cheap (sub-$500). Conversely, don’t rely on a BPO that leverages scripts when what’s needed is a reliable solution with high uptime and good-quality data. This will end up costing a fair sum of money, and I am almost positive it will not end well.
If it sounds too good to be true, then more than likely it is.
Series Note: This is the third post in a series of publications on web data extraction (#webscraping). The goal of this series is to help shed some light on the full life cycle of web data extraction. If you have any specific questions, feel free to contact me at tom@frigginyeah.com or visit my website: https://frigginyeah.com