TomsTechBlog.com

It's hard to say these days

Open Source At All Costs

June 3, 2008 12:44 by Tom

A couple days ago I wrote about a TechCrunch post that "called out" Twitter on their architecture.  Twitter responded here with a pretty standard "things are screwed up but we're trying real hard to fix it" post.  I don't have a problem with that; I think it's their best course of action at this point.

But one response did bug me.  Here's the quote...

Q: Is it true that you only have a single master MySQL server running replication to two slaves, and the architecture doesn’t auto-switch to a hot backup when the master goes down?
A: We currently use one database for writes with multiple slaves for read queries. As many know, replication of MySQL is no easy task, so we've brought in MySQL experts to help us with that immediately. We've also ordered new machines and failover infrastructure to handle emergencies.

Honestly, this is one of the things that really bothers me about the startup community.  There's this idea of "we use Open Source even if it's not up to the job." 

The tasks that Twitter has "brought in MySQL experts to help with" are tasks that one person could easily do with an MS SQL or an Oracle database.  Even someone with very little expertise could get through it with a $40 book or a $245 support call.  Say what you will about MySQL being free, there's just no way it's cheaper in the long run.

Not only that, feature for feature even the Standard Edition of MS SQL Server (Retail Price $6,000) handily beats MySQL.  I'm all for using Open Source Software but not if it means creating inferior technology. 

Here's the thing.  There are a lot of technology tasks that Enterprises do badly.  But the one thing they do well is to keep things running.  My cynical nature suggests part of the reason is that a "non-techie" boss won't notice if your system is badly designed but will notice if it isn't up.  Whatever the case, Enterprise customers look for solutions that don't go down.

So while the Web 2.0 world is better off blazing its own path in social design and user interaction, it can learn a lot by emulating Enterprise in areas of IT infrastructure.

For the record...Though it has nothing to do with the post, I feel the need to give a quick thanks to Microsoft.  Because we are a non-profit, Microsoft gives us very generous terms (95% discounts in most cases).  So when I was faced with this decision it was a lot easier to pick the more capable system, which in turn made it a lot easier to focus on solving actual problems rather than administering databases.


SimpleDB Follow Up

December 16, 2007 20:35 by Tom

A little follow up on the SimpleDB discussion and the various perspectives that have come out of the web since it was introduced.

First, Marcelo Calbucci makes a good point: SimpleDB is not in fact a database, it is a directory service.  As someone who maintains an Active Directory based network, I can tell you there are some significant differences between the two concepts.  That said, I'll let Marcelo tell you himself...

    First of, each object (this is what a "record" is called on a Directory Service) can contain different attributes and the schema can be changed on the fly (a bit more complicated than that).

    The next interesting aspect is that a single attribute (field) can have multiple values, just like the Amazon SimpleDB! This means if I define attribute "UsedBy" I can set the values to "Realtors" and "Brokers". On traditional relational databases you'd need 3 tables to do something like this.

    Finally, a Directory Service allows a hierarchy of objects, meaning instead of Tables you have nodes (which are container objects) and objects hang out of those nodes. Oh oh, SimpleDB doesn't have that, so all my theory goes down the drain.... Not really, they provide a thing called "Domain" which, if you want to (but you don't), can be used as a hierarchy.

I think this seems like a bigger distinction than it actually is, in that most applications will interface with the system the same way they would with a database.  The reason I think it's such an important point, though, is that startups might be thinking they can sign on to SimpleDB and then just move over to a SQL Server when they are ready (I made the same point in my previous post).  That's true if you don't use the directory-specific features, but you'll need to be mindful of that in every aspect of your application design if you want to keep that avenue open.

Building on SimpleDB without using the directory-specific features is going to be difficult, given that you also don't have things like table joins available to you.
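To make the portability concern a bit more concrete, here is a minimal Python sketch of what moving one multi-valued attribute into a relational layout might involve. It does not use the real SimpleDB API; the item is modeled as a plain dict, and the names `item`, `to_relational` and the sample data are all hypothetical. It only illustrates the "3 tables" point from Marcelo's quote.

```python
# Illustrative only: a SimpleDB-style item modeled as a plain dict whose
# attributes may each hold several values. Names and data are hypothetical.
item = {"name": "listing-tool", "attrs": {"UsedBy": {"Realtors", "Brokers"}}}


def to_relational(items, attribute):
    """Flatten one multi-valued attribute into three relational 'tables'."""
    entities = {}   # entity_id -> item name        (table 1)
    values = {}     # value_id  -> attribute value  (table 2)
    links = []      # (entity_id, value_id) rows    (junction table 3)
    value_ids = {}  # reverse lookup so each distinct value is stored once
    for entity_id, it in enumerate(items, start=1):
        entities[entity_id] = it["name"]
        # Sort so the generated ids are deterministic.
        for value in sorted(it["attrs"].get(attribute, ())):
            if value not in value_ids:
                value_ids[value] = len(value_ids) + 1
                values[value_ids[value]] = value
            links.append((entity_id, value_ids[value]))
    return entities, values, links


entities, values, links = to_relational([item], "UsedBy")
```

Every attribute that holds more than one value needs this kind of treatment, which is why using the feature freely makes a later move to SQL Server more work.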

That said, I think some people are making too big a deal out of this problem.  To quote Jonathan Boutelle...

Now we know. This isn't a vanilla mysql clustering service: it's something a little weirder (it's conceptually similar to a database, but lacks many of the features of a database, and works somewhat differently). As a result, you'll have to build your app from the ground up as an Amazon app: this isn't a drop-in replacement for mysql cluster.

I guess it depends on what type of application builder you are in the first place, but in most cases you won't have to build things from the ground up.  This type of situation is really where the beauty of tiered design comes in.  I've not had time to study SimpleDB in detail, but off the top of my head it seems like you could reproduce the concept of a database by splitting your data tier in two and having the upper level mimic database functions.  I know in .NET it would be as simple as creating a DataSet and making your requisite joins in there (which isn't to say that building a solution like that would be simple, just that it's conceptually a simple idea to understand).
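As a rough illustration of that "upper level" idea, here is a small Python sketch of a join done in application code over rows that were fetched separately. It uses Python rather than a .NET DataSet, it never calls SimpleDB itself, and the function name `inner_join` and the sample `users` and `posts` rows are assumptions made up for the example.

```python
from collections import defaultdict


def inner_join(left_rows, right_rows, left_key, right_key):
    """Hash join: index the right side once, then probe it for each left row."""
    index = defaultdict(list)
    for row in right_rows:
        index[row[right_key]].append(row)
    joined = []
    for row in left_rows:
        # .get() avoids creating empty entries for keys with no match.
        for match in index.get(row[left_key], []):
            merged = dict(match)
            merged.update(row)  # left-side values win if names clash
            joined.append(merged)
    return joined


# Hypothetical rows, as the lower data tier might return them from two domains.
users = [{"user_id": "u1", "name": "Ann"}, {"user_id": "u2", "name": "Bob"}]
posts = [
    {"post_id": "p1", "author_id": "u1", "title": "Hello"},
    {"post_id": "p2", "author_id": "u1", "title": "Again"},
]

rows = inner_join(posts, users, "author_id", "user_id")
```

The catch is that both sides have to be pulled into memory first, so this only stays cheap while the data being joined is small.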

Which leads me to my last quote, from Justin Rudd.  Justin says in his post that he doesn't think using SimpleDB is a good idea for startups, because you are putting the responsibility for scaling completely on Amazon and if your startup gets bigger than they can support you'll end up shooting yourself in the foot.  He goes on to say...

Amazon should be part of your backup, not your front line.  Now as with all things, this is shades of gray (there is no black or white in anything in life).  If it is just you (and maybe a buddy), by all means use the Amazon Web Services to get a proof of concept up.  Get an AMAZING front end out that engages customers and keeps them coming back.  Have a couple of small web servers that you can host yourself, and let Amazon deal with all the meaty problems until you start making money.  But once you’ve proven that you’ve got a good idea, start looking at what pieces need to be brought into your own data center to guarantee a good customer experience.

Well, again it comes down to the basic concept that a tiered application should have no idea where its data is coming from except to say "from the data layer".  So no complete rebuilds should be required.  You should be able to, with very little effort, transition to your own servers when it becomes cost effective to do so.  As low as Amazon's prices are, they are still quite a bit higher than the cost of maintaining your own servers, if you can afford to support a full-time position to do that.
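Here is one way that separation might look, as a minimal Python sketch under some loud assumptions: `UserStore`, `InMemoryStore`, `SqlStore` and `greet` are hypothetical names, the in-memory class merely stands in for a hosted service like SimpleDB, and SQLite stands in for a self-hosted SQL Server. None of it is a real SimpleDB or SQL Server client.

```python
import sqlite3
from abc import ABC, abstractmethod


class UserStore(ABC):
    """The only thing the rest of the application is allowed to know about."""

    @abstractmethod
    def save(self, user_id, name): ...

    @abstractmethod
    def get_name(self, user_id): ...


class InMemoryStore(UserStore):
    """Stand-in for a hosted key/attribute service (assumption, not SimpleDB)."""

    def __init__(self):
        self._names = {}

    def save(self, user_id, name):
        self._names[user_id] = name

    def get_name(self, user_id):
        return self._names.get(user_id)


class SqlStore(UserStore):
    """Stand-in for your own SQL server; SQLite is used so the sketch runs."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id TEXT PRIMARY KEY, name TEXT)"
        )

    def save(self, user_id, name):
        self.conn.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user_id, name),
        )

    def get_name(self, user_id):
        row = self.conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


def greet(store, user_id):
    """Application code: works unchanged with either backend."""
    name = store.get_name(user_id)
    return f"Hello, {name}!" if name else "Hello, stranger!"
```

Swapping backends then means changing which class gets constructed at startup, for example `SqlStore(sqlite3.connect(":memory:"))` instead of `InMemoryStore()`, while `greet` and everything above it stays as it was.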

That said, anyone who claims you should have your own servers from the second you start making money needs to look into the time and effort it takes to maintain a database cluster (sorry Justin).  Administering a SQL cluster is hard enough as it is, and that in itself generally requires a full-time person (don't expect your developers to be able to do it).  Moreover, if you are a serious startup that job is going to require several full-time people because you have to have someone watching it at all times.  What happens if it goes down at 1am?  The Internet is worldwide, so that's the middle of the afternoon for some countries.  Do you really want to have 3 hour outages while your DBA wakes up, gets dressed and runs down to find out what has gone wrong?  Amazon provides a whole infrastructure for chump change that comes with built-in peace of mind.  That is worth a lot to a startup even if they have started to make a little money, and I don't see why they wouldn't take advantage of that in their infancy.  No one is saying a company like Facebook should be trying to run off SimpleDB.

On that note, I do think companies such as Microsoft that have existing database products should work at creating a service like this, one with an upgrade path to their own brand of full-blown server products.  If Microsoft were to have a "SQL Server Cloud" product that allowed for easy export to a full SQL Server as a startup grows, I think they might have a chance at capturing a big part of the startup market (which at this point avoids them like the plague).



About Me

Not really relevant right now. This blog is on hiatus. I really haven't decided yet if it is an indefinite hiatus.

For the record, if you've tried to e-mail me over the last 4 to 6 months, I didn't mean to ignore you. The e-mail forwarding isn't working, and I didn't realize that until months' worth of e-mails had been deleted on forward. The tom@tomstechblog.com address still won't forward to the postmaster account, and I don't know why because it's provided by the webhost. But if you're one of my old blog pen pals, I would always welcome an e-mail from you at the postmaster@tomstechblog.com address.


    Disclaimer

    The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

    © Copyright 2013
