Google & Microsoft Working Towards the Perfect Datacenter

We all knew that this would happen: Google and Microsoft vying to build the biggest field of silicon trees.  But what does this mean, and does it tie in with Amazon’s latest service?!  I think that undoubtedly it does.

There’s talk about a last man standing game when it comes to internet bandwidth.  And I can imagine a time when we might see the internet behaving like the freeways in L.A. at rush hour.  But this is more, I think.

I’ve mentioned before that the whole goal here is to “be the internet”.  I don’t think that goal has changed recently.  Google has shown the world two things:  first, that there’s a vast amount of power to be wielded by being “the internet” to the average Tom, Dick, and Harry, and second, that the title is *always* up for grabs.  A while back Yahoo! was the internet, before that AOL was the internet, before that newsgroups were the internet.  Need I say more?  And each of those wielded an extreme sway over the comings and goings of the internet.

But now the internet means a lot more than it used to.  Now the internet is sales, it’s revenue, it’s marketing. People are watching, people are reading, people are listening, and, most importantly, people are being influenced by this “new fangled internet thing”.  “Oh, you mean Google?”

So there’s now a lot more riding on who gets to “be the internet” these days.  The one thing that ginormous corporate entities can’t seem to get a hold of is the fickle way in which the internet is backwards from real world businesses.  In the real world it’s all too common for a newcomer to storm into a market, take hold of it with a genuinely better product, and then let all that slip away into mediocrity and poor quality.  And the kicker is that people will *still* pay for it if it’s crap… as long as it’s tangible. But the internet is fickle. It’s sort of tangible but more or less ethereal.

I think for the first time people outside the scientific communities are getting wind of a crazy idea: insubstantial value.  That is, something that didn’t have value a minute ago, won’t have value a minute from now, but at the moment is extremely valuable.  Which, inherently, means that this thing has the constant need to justify itself.  I’m no economics guy, and I’m certainly not in touch with the “average Joe” (who would almost certainly not follow me through more than two or three blog posts), but I think the difference here is that there’s no physical reality to intimidate us.

We don’t have to grow particularly attached to anything on the internet because it’s not “in our lives”; we’re in its life. It doesn’t take up space in our house, we take up space in its house. For once in our lives we find that we aren’t the ones who are at the mercy of demand, but are, in fact, in demand.  It’s a feeling of empowerment that is slowly but surely changing the world. Mark my words: children in classrooms 100 years from now will be studying the historical impact of all of the events which are happening before our eyes at this very moment, in this place that’s not a place.

I think I’ve become sidetracked.  Oh yes, consumers being in demand, corporations unable to handle the discrepancies between the actions of the same people online and offline, and… Ahh yes… The underdog.

Why do you think it is that in this virtual world it’s so often the couple of guys who met in college coding outside a cafe, or this dude in his mom’s basement, or a couple of people who tried to do one thing but failed fantastically into doing something else completely right?  Because people of talent are, all of a sudden, relieved of the necessity to offer anything physical… People with a talent for the ethereal, all of a sudden, have a place in which the ethereal acquires value.

And, as in any underdog story, these small (sometimes rare) meteoric rises to the top will carry others with them.  And these are the kind of people who remember the hands that helped them up.

So, sure, bandwidth and all that.  But the people who make it easiest for those suited to developing the intangible will have everything to gain in the long run. Amazon sees this, and is doing an amazing job with it.  Their recent successes with S3, SQS, and EC2 are testimony to their understanding of this new ecosystem.  But they ought not to think that Google and Microsoft haven’t noticed this, and noticed where the young blood is heading.

Make no mistake, Amazon has made extremely agile, grassroots moves to “be the internet” from the bottom up… But there will soon be a clash of services as G and M do the same from the “top down” and “sideways in” respectively.

I will say this: The first company to crack the database problem will have a distinct advantage in the struggles to come.

Disclaimer: Everything I just said is more than likely to be complete nonsense as I just kind of rambled it out “stream of consciousness” style.

This is a very interesting idea indeed

So-and-so over at base4.net wrote about going “beyond stateless” by using method-less objects, and I found it interesting. The thing that intrigued me after reading the article was something completely different: the idea of using the client to store their own data.

I’ve often thought to myself that a form of public key encryption should be used for web authentication… removing the hassle of the user name and password altogether… But why not take it a step further, and use it for encrypting the data? You could then have the client store the data for you and transmit it back over the wire when necessary.

I’m not talking about anything like Flickr not saving images on their servers here, I’m talking about things like contact information, notification settings, online social relationships, and preferences. Obviously not all data would be storable in this format, but the biggies could be name, social, email, and credit card numbers (preferably with different keys so that you are able to delegate access on a per-detail basis: None, Name, Contact info, Payment Processing, etc.).

All it would take is a very lightweight fast client store (a la OpenLDAP which reads faster than it writes) and reversible encryption.

Now this would be a disclaimer: “We value your privacy, and therefore do not keep any of your personal information, preferences, history, or other records on our servers. That data is stored on your computer in a 2048-bit encrypted form. Therefore if a hacker were to penetrate our servers they would find absolutely no information which could be used against you.”

The (theoretical) web services database

I’ve been kind of floating around this topic for a while… Well databases in general… And I see a lot of people who have rather high standards (which is not a bad thing.)  I imagine the complication of offering a service like this comes from the fact that database people have very stringent standards.

Things like ACID transactions, foreign keys, and table/row/column/field read/write locking always come up in these types of conversations.  I suppose that this is so because it’s been the standard for so long… It’s just how people *think* about databases… Which means that it’s what databases should be, right? Right?

Well, not long ago the people at Amazon rethought process communication, and rethought storage, and then rethought servers.  Perhaps it’s about time they rethought the database as well.  I have a hunch (as others have noted here before) that they already are!

I really think that a lot, and I mean a LOT, could be done with a very simple model.

  1. Tables are their own island (no foreign keys)
  2. Simple auto-incrementing PKs
  3. Every column indexed
  4. Only simple operators supported ( =, >, <, !=, is null, is not null )
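
To make the four points above a little more concrete, here is a minimal in-memory sketch in Python. Everything in it (the `SimpleTable` name, the `insert` and `select` methods, the choice to skip nulls in comparisons) is my own assumption for illustration, not anybody’s actual service:

```python
import operator
from collections import defaultdict

# Hypothetical sketch: SimpleTable is an illustrative name, not a real API.
COMPARISONS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


class SimpleTable:
    """One stand-alone table: no foreign keys, no joins."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = {}        # pk -> row dict
        self.next_pk = 1      # simple auto-incrementing primary key
        # Every column gets an index: value -> set of primary keys.
        self.indexes = {c: defaultdict(set) for c in self.columns}

    def insert(self, **values):
        unknown = set(values) - set(self.columns)
        if unknown:
            raise KeyError(f"unknown columns: {sorted(unknown)}")
        pk = self.next_pk
        self.next_pk += 1
        row = {c: values.get(c) for c in self.columns}  # missing -> None (null)
        self.rows[pk] = row
        for column, value in row.items():
            self.indexes[column][value].add(pk)
        return pk

    def select(self, column, op, value=None):
        index = self.indexes[column]
        if op == "is null":
            pks = set(index.get(None, set()))
        elif op == "is not null":
            pks = {pk for v, s in index.items() if v is not None for pk in s}
        else:
            compare = COMPARISONS[op]
            # Assumption: like SQL, comparisons never match null values.
            pks = {
                pk
                for v, s in index.items()
                if v is not None and compare(v, value)
                for pk in s
            }
        return [dict(self.rows[pk], id=pk) for pk in sorted(pks)]


users = SimpleTable(["name", "age"])
users.insert(name="ann", age=31)
users.insert(name="bob")            # age left null
users.insert(name="cy", age=45)
print(users.select("age", ">", 40))
print(users.select("age", "is null"))
```

The range operators here still walk the whole index for the column; a real implementation would presumably keep sorted indexes, but that detail is beside the point of the model.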

Heresy! Ack! Foo! Bar! NO! THAT’S NOT A REAL DATABASE.  Well, no, not as you mean by “real database”, but it certainly is a database.  And I expect it would be good enough for 85% of people’s wants, needs, and desires.

We’ve learned that delays in storage give us permanence.  We’ve learned that the pipeline is a good (and global) thing, and we’ve learned that impermanence gives us expandability.  Necessity being the mother of invention, I expect that something like this will be out soon, and I expect that people will learn to be perfectly happy with it.  It’s all about flexibility and agility here, people!
It’ll come, people will complain, it’ll work, and as time goes on, I think it’ll get better and better.

Distributed MySQL Via Web Services?

Imagine for a moment, if you will, making your MySQL queries via a REST API. Weird, huh? I’ll admit it’s a crazy idea, but then a lot of my ideas are crazy. Still. Work with me here.

Query –> || REST API ||

  • The query is a Select
    1. The REST API synchronously determines both which servers are up and which is the fastest to respond.
    2. The API connects to that server with your user name and password (specified in the request header).
    3. The query is run on that server, and the response is passed back through to you.
    4. Connection closed.
  • The query is an Insert/Update/Delete
    1. The REST API synchronously determines both which servers are up and which is the fastest to respond.
    2. The API verifies your credentials against that server, and gives you a Query ID. (You can then re-query the API with the Query ID to determine if the query has been fully replicated.)
    3. The API writes the query into replication directories, a la slurpd.
    4. The query is then passed along to all of the real MySQL servers.

Plenty of details to iron out here, but it’s certainly feasible… And definitely interesting…
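
The flow above can be roughed out in a few lines of Python. This is only a sketch under heavy assumptions: `FakeServer` stands in for a real MySQL server, `latency_ms` is a made-up measurement, a plain list stands in for the replication directories, and the credential check is left out entirely. None of these names come from MySQL or slurpd.

```python
import uuid


class FakeServer:
    """Hypothetical stand-in for a real MySQL server."""

    def __init__(self, name, latency_ms, up=True):
        self.name = name
        self.latency_ms = latency_ms  # assumed to be measured elsewhere
        self.up = up
        self.log = []                 # queries this server has run

    def run(self, query):
        self.log.append(query)
        return f"{self.name} ran: {query}"


class QueryRouter:
    """Hypothetical front end playing the part of the REST API."""

    def __init__(self, servers):
        self.servers = servers
        self.queue = []    # stand-in for the replication directories
        self.pending = {}  # query_id -> names of servers still to be written

    def fastest(self):
        live = [s for s in self.servers if s.up]
        if not live:
            raise RuntimeError("no servers are up")
        return min(live, key=lambda s: s.latency_ms)

    def handle(self, query):
        words = query.split()
        verb = words[0].lower() if words else ""
        if verb == "select":
            # Reads go straight to the fastest live server.
            return {"result": self.fastest().run(query)}
        if verb in ("insert", "update", "delete"):
            self.fastest()  # confirms a server is up; credential check omitted
            query_id = uuid.uuid4().hex
            self.queue.append((query_id, query))
            self.pending[query_id] = {s.name for s in self.servers}
            return {"query_id": query_id}
        raise ValueError(f"unsupported query: {query!r}")

    def replicate(self):
        """Pass queued writes along to every server that is currently up."""
        for query_id, query in self.queue:
            for server in self.servers:
                if server.up and server.name in self.pending[query_id]:
                    server.run(query)
                    self.pending[query_id].discard(server.name)
        # Keep only the writes that still have servers waiting on them.
        self.queue = [(qid, q) for qid, q in self.queue if self.pending[qid]]

    def is_replicated(self, query_id):
        return not self.pending[query_id]


router = QueryRouter([FakeServer("a", 20), FakeServer("b", 5)])
print(router.handle("SELECT 1"))
ticket = router.handle("INSERT INTO t VALUES (1)")
router.replicate()
print(router.is_replicated(ticket["query_id"]))
```

Writes that arrive while a server is down simply stay queued for that server until a later `replicate()` pass, which is roughly the behaviour I would want from the replication directories.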

    One Resource to Rule Them All!

    One resource to rule them all,

    One resource to find them,

    One resource to bring them all,

    And in the darkness bind them,

    In the land of server where the shadows lie.

    It’s been a bumpy road to people’s understanding of the EC2 service. And a large part of the problem is a point of view gap between the masses and Amazon. It’s a lot like an American visiting India wondering why he can’t order a steak (disclaimer: I don’t actually know whether you can order a steak in India, but the point essentially remains.) They have a different point of view in regards to the cow.

    So too does Amazon have a different point of view on resources. Your average web guy sees a server as a single resource: “that server is very powerful, it could do a LOT” or “that’s an old server, not a lot can be done with it”. Because for so long we were able to get X number of servers, those servers would be assigned roles, and that’s what they were. A better server could crawl more pages, or store a larger database, or serve more page views. And of course this meant that the server was specific to the application. But this model gets more and more difficult to maintain as the project gets larger and larger. Anyone who’s gone from 15 to 85 servers knows this. And it boils down to one single point: Permanence does not scale.

    So the Amazon guys decided to look at things differently. Your basic components of a server are MHz, RAM, bandwidth, and disk space. And they look at a server as a pool of those specific resources. You don’t have 15 good servers, you have 180,000 MHz, 120 GB of RAM, and 13,500 GB of disk space.
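
The arithmetic behind that view is just addition. A tiny Python sketch, assuming (my assumption, obtained by dividing the totals above by 15) that each of the 15 servers has 12,000 MHz of CPU, 8 GB of RAM, and 900 GB of disk; the `Server` and `pool` names are made up for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    mhz: int
    ram_gb: int
    disk_gb: int


def pool(servers):
    """Collapse a list of servers into one pool of raw resources."""
    return Server(
        mhz=sum(s.mhz for s in servers),
        ram_gb=sum(s.ram_gb for s in servers),
        disk_gb=sum(s.disk_gb for s in servers),
    )


# Assumed per-server figures: the totals in the text divided by 15.
fleet = [Server(mhz=12_000, ram_gb=8, disk_gb=900)] * 15
print(pool(fleet))
```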

    And since permanence doesn’t scale… permanence is built OUT. This is a difficult concept to grasp for most people (myself included!), and building an application which doesn’t rely on permanence is difficult. It’s a learning process, but a necessary one. Once people learn to put permanence in the right places, once we all figure out the tricks of the trade, I am of the opinion that the web as a whole will become a much more stable place.

    There certainly will be some growing pains though. For example, right now a huge pain is the dependence on popular database products (MySQL, PostgreSQL), which are wonderful, don’t get me wrong, but they are, currently, limited to the realm of the server, instead of the realm of the cloud.
    So let’s all put our heads together and start thinking of ways in which we can make use of the cloud as a cloud. We can do this!

    Down with HTML E-Mail!

    Begin rant

    I’m with Jeremy on this one… Let’s face it, e-mail is broken.  We have long since outgrown it, and we have been living with the pains of it for a long time now.  It’s everyone’s favorite internet whipping boy. “I hate spam” “I hate stupid forwards” “I hate huge attachments”.  We spend all our time bitching about e-mail, but then when something happens it’s “the sky is falling the sky is falling give me back my good sweet innocent e-mail the way it was before you broke it! It was JUST FINE THE WAY IT WAS WHY DID YOU HAVE TO CHANGE IT?!”

    Go whine to somebody else, seriously. E-mail is the black plague of the internet, it’s an infectious disease, a self-sustaining spiral down the drain of absurdity. I, for one, will be happy when all of the people who depend on it, and who enable it, and who empower it finally go retire on some island somewhere and the kids take over and it’s all about text messaging, not e-mail.

    Speaking of kids taking over: “SUCKS TO YOUR EMAIL!”

    End rant

    ORDB gone…. Bummer….

    ORDB seems to have closed its doors. That’s huge, and sad. Farewell, ORDB. I wish I had more info about this one. If anyone has anything to add (or clarify on the subject) I would appreciate it being left in the comments. By the time I saw this their page had gone down, but I did find a reproduction on The Spam Diaries: ORDB Blocklist Gone.

    2006-12-18 11:34
    We regret to inform you that ORDB.org, at the ripe age of five and a half, is shutting down. It’s been a case of a long goodbye as very little work has gone into maintaining ORDB for a while. Our volunteer staff has been pre-occupied with other aspects of their lives. In addition, the general consensus within the team is that open relay RBLs are no longer the most effective way of preventing spam from entering your network as spammers have changed tactics in recent years, as have the anti-spam community.

    We encourage system owners to remove ORDB checks from their mailers immediately and start investigating alternative methods of spam filtering. We recommend a combination involving greylisting and content-based analysis (such as the dspam project, bmf or Spam Assassin).

    DNS and the mailing lists will vanish today, December 18, 2006.

    This website will vanish by December 31, 2006.

    Apparently this is old hat, and I’ve just found out, finally, due to the drying up of my DNS servers’ caches… Still. So long and thanks for all the fish!