My implementation of Facebook's Haystack storage solution. It's certainly not a faithful recreation, but I think this se

Ranter

AlgoRythm

49881

Comments

8

monr0e

1229

6y

Ew. Why.

Haystack is incredibly restrictive. To edit a file, you have to open and extract the entire stack. Or, you have to remove the old file and add the new one to the end of the stack. Or version the file and index the versions, leaving all the old files in permanent storage.

You now have a working emulation of the first iterations of cassette storage. 1959 says hello.
2
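
For reference, the "remove the old file and add the new one to the end of the stack" option can be sketched in a few lines. This is a toy illustration only, not AlgoRythm's code or Facebook's actual format: the `Haystack` class, its `put`/`get`/`total_bytes` methods, and the in-memory `BytesIO` standing in for the stack file are all assumptions made up for the example.

```python
import io


class Haystack:
    """Toy append-only store (hypothetical): one byte stream plus an in-memory index."""

    def __init__(self):
        self.stack = io.BytesIO()  # stands in for the single large file on disk
        self.index = {}            # key -> (offset, size) of the newest copy

    def put(self, key, data):
        # Always write at the end; an "edit" is just another append.
        offset = self.stack.seek(0, io.SEEK_END)
        self.stack.write(data)
        # The index now points at the new copy; the old bytes stay in the stack.
        self.index[key] = (offset, len(data))

    def get(self, key):
        offset, size = self.index[key]
        self.stack.seek(offset)
        return self.stack.read(size)

    def total_bytes(self):
        return self.stack.seek(0, io.SEEK_END)


hs = Haystack()
hs.put("a", b"first version")
hs.put("b", b"other file")
hs.put("a", b"v2")  # "edit" a by appending a replacement

print(hs.get("a"), hs.total_bytes())
```

The stack keeps growing on every edit, which is exactly the cost monr0e is pointing at: the superseded copy of `a` is still taking up space until something rewrites the stack.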

AlgoRythm

49881

6y

@monr0e It has its places. For example, the Facebook engineer said that deletion was incredibly rare, so they were not concerned by it.

I'm using it for a media server. I don't even really have plans to allow for media to be removed from it, much less edited.

Remember that a server can have more than one data solution too. Use a haystack for some stuff, database for other stuff, flat files for the other stuff, etc.
3

monr0e

1229

6y

@AlgoRythm grumble grumble wheel something-or-other.

A scenario in which edits rarely happen is astoundingly rare. Facebook most likely has a shit-tonne of issues with it that don't get addressed because the management structure and attitude there is horrific. Aren't you going to add play counts to your media files? Or perhaps metadata that is automatically collected when you place media on it? What about if you add new storage to your media centre and have to edit your location strings to reflect it? My point is, reel storage was abandoned because it couldn't adapt to an environment that nobody expected to change so quickly. Haystack is the same, but in an era where implementing it is absurd.
0

AlgoRythm

49881

6y

@monr0e View count and metadata are all in the database. The raw media is just stored in haystacks because it's convenient! Especially considering this is designed to be a household server, not a distributed one.
0

monr0e

1229

6y

@AlgoRythm why? Do you really have that much media? And are you going to build container, codec and compression attributes into a separate db in a fashion that a modern media player can read?
0

AlgoRythm

49881

6y

@monr0e each episode is uploaded as a separate mp4 file, and when it gets to hundreds of episodes, the file structure gets ugly. Just a directory full of (datetime).mp4

It works well for what I need, and it solves an issue I had. And I had fun implementing it.
3

monr0e

1229

6y

@AlgoRythm I can taste vomit.

OK, in all seriousness, there's no reason this doesn't work. However, you best be damn sure you have some fault tolerance built in, since locking a file of that size open for that long feels like a recipe for drive failure. Unless, of course, you're opening the media entirely in memory, in which case I hope you are made of money, given the size of even h265 nowadays.
1
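
On the memory question: pulling one file out of a stack does not have to mean reading the whole stack. A minimal sketch, assuming the index stores a byte offset and size for each file (the thread does not say how AlgoRythm's index is laid out); `read_needle` and the temporary stand-in stack file are hypothetical names for illustration.

```python
import os
import tempfile


def read_needle(path, offset, size, chunk_size=64 * 1024):
    """Yield one stored file's bytes in chunks, touching only its byte range."""
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = size
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # stack file is shorter than the index claims
            remaining -= len(chunk)
            yield chunk


# Build a tiny stand-in stack file: three "episodes" written back to back.
episodes = [b"A" * 10, b"B" * 25, b"C" * 7]
fd, stack_path = tempfile.mkstemp(suffix=".stack")
with os.fdopen(fd, "wb") as f:
    for episode in episodes:
        f.write(episode)

# The second episode starts right after the first (offset 10) and is 25 bytes long.
second = b"".join(read_needle(stack_path, offset=10, size=25, chunk_size=8))
os.remove(stack_path)
```

Memory use is bounded by `chunk_size` rather than by the size of the stack or of the episode, so a streaming server could hand each chunk to the client as it is read.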

magicMirror

10322

6y

@AlgoRythm "deletion is incredibly rare"
😱😱😱😱
Deletion never happens for Facebook, so it makes sense to store "readonly" + "small sized" files in this horrendous way.
It also makes duplication across the network much simpler, and reduces the inode load on the HD.

never, ever, use this for files you want to modify.
0

KDSBest

732

6y

Custom FS seems more reasonable
0

not-user-telken

23

6y

@AlgoRythm I'm intrigued, are all the files (roughly) the same size?
0

AlgoRythm

49881

6y

@not-user-telken Facebook or mine? Facebook yes, mine no.
0

not-user-telken

23

6y

@AlgoRythm I was asking about yours, but the extra info is appreciated. How do you handle moving indexes on insertion or deletion? Just like an array?
0

AlgoRythm

49881

6y

@not-user-telken Indexes won't move; on deletion they would just get marked as deleted. If you actually wanted to remove the data to reclaim the space, it would work the same as an array.splice.
0
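
The "mark as deleted, splice later" idea can be sketched roughly like this. It is an illustration under assumptions, not the actual implementation: the `(offset, size, deleted)` index tuples and the `delete`/`compact` helpers are invented for the example, and the stack is a plain `bytes` object rather than a file.

```python
def delete(index, key):
    """Mark an entry as deleted; the bytes stay where they are."""
    offset, size, _ = index[key]
    index[key] = (offset, size, True)


def compact(stack, index):
    """Rebuild the stack without deleted entries; return (new_stack, new_index)."""
    new_stack = bytearray()
    new_index = {}
    # Walk entries in stack order so surviving files keep their relative order.
    for key, (offset, size, deleted) in sorted(index.items(), key=lambda kv: kv[1][0]):
        if deleted:
            continue
        new_index[key] = (len(new_stack), size, False)
        new_stack += stack[offset:offset + size]
    return bytes(new_stack), new_index


# Hypothetical stack holding three files: "a" (4 bytes), "b" (6 bytes), "c" (2 bytes).
stack = b"aaaabbbbbbcc"
index = {"a": (0, 4, False), "b": (4, 6, False), "c": (10, 2, False)}

delete(index, "b")  # cheap: nothing moves, "c" is still at offset 10
compacted, new_index = compact(stack, index)  # expensive: "c" shifts down to offset 4
```

Deletion itself only flips a flag, so it costs nothing in I/O. Offsets only change during compaction, which is the splice-like step and the only point where every later entry in the index has to be rewritten.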

not-user-telken

23

6y

@AlgoRythm And no insertion, just append? Also, for the purpose intended, I'd guess there is no "editing" of files in a haystack.
0

AlgoRythm

49881

6y

@not-user-telken No, it's just a "throw it all in a big pile" sort of solution.
