I had a COBOL moment last night
I had a college computer science professor who used to say he would get himself cryogenically frozen and when the time came to reanimate him, as long as COBOL was still around he would be happy.
I was at an NWEN gathering last night and one of the speakers was very clever and started by saying something to the effect of "if you haven't heard of this, you're already behind". That sort of statement kind of triggers a weird primal instinct in a computer person, I think :)
Anyways, the topic was Hadoop, an open source implementation of the MapReduce programming model that Google uses internally.
I really have not done any type of clustered computing before, but when combined with Amazon's Elastic Compute Cloud (EC2), it does become a lot more approachable. I'm working my way through this article on combining Hadoop and Amazon's web services.
That article uses web server log parsing as an example. It's an interesting example because parsing a web server log is simple enough, but the time it takes grows with the size of the log file. With something like Hadoop, however, the input gets split across machines, so you just have to add more computing power to scale with the size of the input. When you use Amazon's EC2, that is literally just a matter of firing up a few more of their computing nodes and paying a few more cents per hour. In theory, Amazon EC2 has an infinite amount of computing power; apparently they have a datacenter bigger than Rhode Island :)
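To make the log parsing idea a bit more concrete, here is a minimal single-machine sketch of the map/reduce pattern in plain Python. It is not Hadoop's API and not the code from the article; the function names (`map_line`, `shuffle`, `reduce_counts`, `count_hits`), the sample log lines, and the assumption that the log is in Common Log Format are all mine, purely for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_line(line):
    """Map step: emit a (path, 1) pair for one access-log line."""
    parts = line.split()
    # Assumption: Common Log Format, where the request path is the
    # 7th whitespace-separated field. Malformed lines emit nothing.
    if len(parts) > 6:
        yield parts[6], 1


def shuffle(pairs):
    """Group values by key, like the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: add up all the 1s emitted for a single path."""
    return key, sum(values)


def count_hits(lines):
    """Run map, shuffle and reduce in-process over an iterable of lines."""
    pairs = chain.from_iterable(map_line(line) for line in lines)
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample data, made up for this sketch.
SAMPLE_LOG = [
    '127.0.0.1 - - [10/Oct/2008:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '127.0.0.2 - - [10/Oct/2008:13:55:40 -0700] "GET /about.html HTTP/1.0" 200 512',
    '127.0.0.1 - - [10/Oct/2008:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'garbage line',
]

if __name__ == "__main__":
    print(count_hits(SAMPLE_LOG))
```

The point of the shape is that `map_line` only ever looks at one line and `reduce_counts` only ever looks at one key, so in principle a framework can hand different chunks of the log to different machines without the pieces needing to know about each other.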
I wonder if there are any scenarios in the ALM field that would lend themselves to this type of computing. It seems like you have to retrain the way you think a little bit. For instance, what did you always want to do but never try because the effort would not scale?
There must be some really interesting things you could do with code analysis if you applied this type of programming model to it.
For example, some of the code analysis solutions that we ran internally at Microsoft took a few days to run. I'm not sure how they were implemented, but I wonder whether, if they were run on a Hadoop-like model, that type of deep analysis could be done in real time?
Anyways, just thought I would share.
Thanks!
Eric.
Labels: Hadoop

