This brings up an interesting issue of scaling. Most larger sites would use a relational database for the backend store (right or wrong) and scale the frontend and backend independently. Because news.yc is using in-memory and disk storage, what are the options available to scale this type of backend architecture?
Assuming load was the issue earlier (which it most likely wasn't), would you move the store to something like a Berkeley DB with a network-accessible frontend, or the newer MemcacheDB? Maybe a relational database, or even Amazon's new DB web service? If this site were your startup, what would your next move be?
In-memory, in-process storage is by far the fastest option (assuming you know a hashtable from a granola bar). A single server can easily handle 1000+ requests/second, which is plenty. The only reason not to do it is if your data doesn't fit in RAM, which is probably not the case for News.YC.
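To make that concrete, here's a toy version of the in-memory, in-process approach: every item lives in a plain hashtable, so a read is just a dict lookup with no network hop to a separate store. (Python purely for illustration; News.YC itself is Arc on MzScheme, and the data layout here is made up.)

    from http.server import BaseHTTPRequestHandler, HTTPServer

    items = {"1": {"title": "Example post", "score": 5}}  # id -> item, all in RAM

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # a "query" is a hashtable lookup; no round trip to a database
            item = items.get(self.path.lstrip("/"))
            if item is None:
                self.send_error(404)
            else:
                self.send_response(200)
                self.end_headers()
                self.wfile.write(repr(item).encode())

    HTTPServer(("", 8080), Handler).serve_forever()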
This is a scaling question: assuming the current setup is no longer enough, what would be the best way to scale this architecture?
Probably the easiest would be to use memcache for some of the repeated queries, specifically the items that are relatively static, such as the original post or posted item details (see the sketch below).
It probably wouldn't help much on the front page, since that updates constantly, so caching it would just be wasted overhead.
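Something like this, assuming the standard python-memcached client; render_item and the five-minute TTL are just placeholders for the sketch:

    import memcache

    mc = memcache.Client(["127.0.0.1:11211"])

    def render_item(item_id):
        # stand-in for the real, expensive page render
        return "<html>item %d</html>" % item_id

    def item_page(item_id):
        key = "item:%d" % item_id
        page = mc.get(key)               # cache hit: skip the render entirely
        if page is None:
            page = render_item(item_id)
            mc.set(key, page, time=300)  # relatively static, so cache 5 min
        return page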
This doesn't really address the issue that there's no centralized information repository. With the content entirely in RAM, distributing the app across two boxes would require either some sort of message-broadcasting system or re-architecting how the data is stored.
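For what it's worth, the message-broadcasting option could be as crude as replaying every write to the peer boxes so each keeps its own full in-RAM copy. A minimal sketch, where PEERS and apply_write are made-up names, and which ignores failures and ordering entirely (which is exactly where the real pain is):

    import json
    import urllib.request

    PEERS = ["http://10.0.0.2:8080/replicate"]  # the other app servers

    store = {}

    def apply_write(update):
        store[update["id"]] = update    # stand-in for the real mutation

    def write(update, replaying=False):
        apply_write(update)             # mutate the local in-RAM copy first
        if replaying:
            return                      # a replayed write isn't re-broadcast
        data = json.dumps(update).encode()
        for peer in PEERS:
            req = urllib.request.Request(
                peer, data=data,
                headers={"Content-Type": "application/json"})
            urllib.request.urlopen(req, timeout=2)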
Agreed that in-memory storage is by far the fastest form of storage available to a single-server setup like this. However, I was asking hypothetically: if load were the issue, how would people scale the backend? I was curious whether one option would emerge as a favorite.
I disagree that the only reason not to do it is if the data won't fit in RAM. Going to a multi-server setup has many advantages, including redundancy for situations such as this one, at the cost, of course, of increased administration.
Later: Problem seems fixed. The server had been running for a comparatively long time, maybe two weeks. Presumably it (or MzScheme) ran out of something. It seemed more important to make the site work than to figure out what.
I don't know the details of the News.YC architecture, but if it's an error that happens unexpectedly, it's probably due to some scheduled, automatically recurring function that fails somehow and eats up all the memory...
So it's probably in the web server or the stats logging.
Update: or the weighting algorithm, which is obviously a new addition.
Did anyone else lose their comments, karma, etc.? I'm down to 1 point. My profile settings seem to have been reset as well, though it still remembers my saved articles.
Ugh. Looks like some accounts got corrupted when I shut down the server. If anyone else sees similar problems, please let me know here, and I'll fix your account from backups.
Data got corrupted because of server restarts? That doesn't sound like a robust design. I can see how some sessions might get kicked out, but data corruption should not occur.
It didn't seem to happen to me, but I feel bad for others who didn't realize their karma dropped.
I get the "unknown or expired link" error almost every time I comment... probably because it always takes me 10 minutes to sift through the muddled essay I write to myself before figuring out what I actually want to say.
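As I understand it, that error comes from links that point at server-side closures: each form URL carries a random id mapped to a handler, and a sweeper drops old entries, so a page left open too long yields "unknown or expired link." A rough sketch of that mechanism (all names made up; this isn't the actual Arc code):

    import os, time

    handlers = {}    # fnid -> (creation time, closure)
    MAX_AGE = 600    # seconds before a link goes stale

    def make_link(closure):
        fnid = os.urandom(8).hex()
        handlers[fnid] = (time.time(), closure)
        return "/x?fnid=" + fnid

    def dispatch(fnid):
        entry = handlers.get(fnid)
        if entry is None or time.time() - entry[0] > MAX_AGE:
            return "Unknown or expired link."
        created, closure = entry
        return closure()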
We were conversing a bit earlier and hypothesized it was a DDoS from the spam thread discussed yesterday. I was also thinking that maybe a spam filter put in place had a bug.