I have X data objects, with somewhere between roughly 45,000 and 50,000 possible values associated with each object. I know… don’t ask (I couldn’t tell you anyway). Doing this in MySQL is… well… possible… but absurd. I’m thinking of trying out the approach I’ve mused about here. It could be a really great way to manage finding commonalities across tens of thousands of objects with a total of hundreds of millions of values. Or it could be a massive time sink.
It would also put some of the things Amazon has said about their S3 service to the test 🙂 I doubt anyone has really stored a hundred million objects in a single S3 bucket and cared enough about seek time to be critical of it 🙂
Or am I missing some magic bullet here? Is there a (free) DBMS I’m not thinking of that handles 50,000 columns in a table with a fast comparative lookup? (select pk from table where v42000 = (select v42000 from table where pk = referencePk))… I’d love for someone to pop in and say “HEY, STUPID, DB XXX CAN DO THAT!” 🙂 assuming DB XXX isn’t expensive 🙂
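For the record, the closest thing I know of to a magic bullet here isn’t more columns but fewer: one narrow row per (object, attribute) pair instead of 50,000 columns. A rough sketch, with made-up names and assuming the values fit in an integer:

```sql
-- one row per (object, attribute) pair instead of a 50,000-column table
CREATE TABLE object_values (
    pk   BIGINT NOT NULL,   -- object id
    attr INT    NOT NULL,   -- attribute number (1..50000)
    v    BIGINT NOT NULL,   -- the value; the integer type is an assumption
    PRIMARY KEY (pk, attr),
    KEY attr_value (attr, v) -- makes the comparative lookup an index hit
);

-- the same lookup as the subquery above: which objects share
-- attribute 42000 with reference object 123?
SELECT a.pk
FROM object_values a
JOIN object_values ref
  ON  ref.attr = a.attr
  AND ref.v    = a.v
WHERE ref.pk = 123
  AND a.attr = 42000
  AND a.pk  <> ref.pk;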
Hmm… Off to ponder…
Can you just save a serialized version of the object in a blob field with the key?
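A minimal sketch of that idea (MySQL flavor, names made up) and where it falls down:

```sql
-- one row per object; the ~50,000 values live in a single serialized blob
CREATE TABLE objects (
    pk   BIGINT   PRIMARY KEY,
    data LONGBLOB NOT NULL   -- serialized object, in whatever format you like
);

-- retrieval by key is cheap...
SELECT data FROM objects WHERE pk = 123;

-- ...but "which objects share v42000 with object 123?" means fetching
-- and deserializing every blob in the application layer
```

That makes the storage trivial, but every comparative query becomes a full scan plus a deserialize per row.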
If you sometimes need to query on those 50,000 attributes, serialize them as XML and then try postgres + XML extensions (http://www.throwingbeans.org/postgresql_and_xml_updated.html) and let me know if it works…
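A rough sketch of what that might look like, assuming the contrib xml2 module’s xpath_string() that the linked post covers; the table, element names, and reference pk are all made up:

```sql
-- one row per object, attributes serialized as an XML document
CREATE TABLE objects (
    pk   BIGINT PRIMARY KEY,
    data TEXT NOT NULL       -- e.g. '<attrs><v42000>17</v42000>…</attrs>'
);

-- the comparative lookup, with XPath standing in for the 42,000th column
SELECT o.pk
FROM objects o
WHERE xpath_string(o.data, '/attrs/v42000') =
      (SELECT xpath_string(data, '/attrs/v42000')
         FROM objects WHERE pk = 123);
```

Without an expression index on that xpath_string() call, though, every lookup is presumably a full scan plus an XML parse per row, so test it at scale before committing.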