Amazon interview question

storing binary file

Interview Answer

Anonymous

Jan 8, 2019

How to store a binary file depends on the file size and on how the file is used. Very small binary files can be stored in blob storage or on a file system. Large binary files are better split into smaller chunks (e.g., 4 MB per chunk). The advantage is that when files are delivered to consumer applications, a transfer can resume from the point of failure, and the same holds when files are uploaded to the server. If the files are part of a video upload and replay system, chunking also allows serving smaller pieces for playback and seeking back and forth by offset within a file. It also makes it easy to hash the chunks and implement deduplication. For content delivery systems (like Netflix), popular files should also be stored on CDN servers to allow geographically based content delivery. File metadata should be stored separately so that information about the files is easy to keep and to search. For high availability and reliability, all stored files should be kept redundantly.
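
To make the chunking and deduplication idea concrete, here is a minimal Python sketch. It is only an illustration under some assumptions: `chunk_store` is a plain dict standing in for real blob storage, the function names are made up for this example, SHA-256 is one possible choice of hash, and the "manifest" (the ordered list of chunk hashes) is the kind of information that would live in the separate metadata store. The `start_index` parameter shows how an interrupted transfer might pick up at a chunk boundary.

```python
import hashlib
import io

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, in line with the example chunk size above


def iter_chunks(stream, chunk_size=CHUNK_SIZE, start_index=0):
    """Yield (index, data) pairs, optionally resuming at chunk start_index."""
    stream.seek(start_index * chunk_size)
    index = start_index
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield index, data
        index += 1


def store_file(stream, chunk_store, chunk_size=CHUNK_SIZE, start_index=0):
    """Store chunks keyed by their SHA-256 digest and return the digest list.

    chunk_store is a hypothetical dict used here in place of blob storage.
    """
    manifest = []
    for _, data in iter_chunks(stream, chunk_size, start_index):
        digest = hashlib.sha256(data).hexdigest()
        chunk_store.setdefault(digest, data)  # identical chunks are kept once
        manifest.append(digest)
    return manifest


def read_file(manifest, chunk_store):
    """Reassemble a file from its ordered list of chunk digests."""
    return b"".join(chunk_store[digest] for digest in manifest)


if __name__ == "__main__":
    store = {}
    payload = b"A" * 8 + b"B" * 8 + b"A" * 8 + b"C" * 3
    manifest = store_file(io.BytesIO(payload), store, chunk_size=8)
    print(len(manifest), "chunks in file,", len(store), "chunks stored")
    assert read_file(manifest, store) == payload
```

A real system would add redundancy by replicating the chunk store, and would persist the manifest so that a failed upload or download knows which chunk index to continue from.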
