Less Files, More Miles: Stream Your Media from MongoDB

Use PHP and MongoDB GridFS to handle images and videos. Fewer files in your file system, plus the utility of cloud services.

Recently, I’ve been working with MongoDB and GridFS in PHP to upload images and videos and stream them back out on the fly. Using GridFS lets us bypass the local filesystem, so media doesn’t take up space on our own servers. The benefit may seem minor when you upload a few images, but it becomes clear once you have a media library of many large videos.

For those who are not familiar with MongoDB, it is an open-source NoSQL database, and GridFS is a built-in feature for storing large files in MongoDB. Now that I’m more comfortable with Mongo and the PHP driver, I find it a great way to store and stream media. Here’s how:

Storing images and videos into the GridFS

First, you’ll need a connection to the GridFS for your MongoDB. Let’s call this file saveFile.php:

<?php
// connect to the 'myGrid' GridFS
$m = new Mongo();                    // gets a connection to Mongo
$db = $m->myDB;                      // gets the database called "myDB"
$myGrid = $db->getGridFS('myGrid');  // gets the GridFS with the prefix "myGrid"
?>

To store a file into the GridFS, you need to call the storeFile method.

<?php
// store a file into the GridFS
$myGrid->storeFile($some_file, $data_array);
?>

You might want to store some information along with your file, such as a timestamp or MIME type, so your data array can look like this:

<?php
// some extra data you may want to store with your file
$data_array = array(
    'mime' => mime_content_type($some_file),
    'timestamp' => time(),
    'metadata' => array( $more_meta_data ),
);
?>

When you store a file into the GridFS, it creates two collections: myGrid.chunks and myGrid.files. myGrid.files contains the metadata for your file, while myGrid.chunks contains the chunked media data.
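
To make the split between the two collections more concrete, here is a rough sketch of the arithmetic behind chunking. It assumes a chunk size of 261120 bytes (255 KB), the default in current MongoDB drivers; older drivers may use a different default, and the value actually used is recorded in the chunkSize field of each myGrid.files document. The function name gridfs_chunk_count is made up for this example and is not part of the driver.

```php
<?php
// Hypothetical helper (not part of the Mongo driver): estimates how many
// documents a file of $length bytes occupies in myGrid.chunks.
// The 255 KB default is an assumption; check the chunkSize field of your
// own myGrid.files documents for the real value.
function gridfs_chunk_count($length, $chunkSize = 261120)
{
    if ($length < 0 || $chunkSize <= 0) {
        throw new InvalidArgumentException('length must be >= 0 and chunkSize > 0');
    }
    // Every chunk is full except possibly the last one, so round up.
    return (int) ceil($length / $chunkSize);
}

// A 10 MB video needs 41 chunks at 255 KB each.
echo gridfs_chunk_count(10 * 1024 * 1024), "\n";
```

This is why a library of large videos produces many more documents in myGrid.chunks than in myGrid.files: one metadata document per file, but one chunk document per 255 KB of data.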

Fetching files from the GridFS

To get a handle to the file that you stored, you will need to query your GridFS.
So, in another file called displayFile.php:

<?php
// connect to the 'myGrid' GridFS
$m = new Mongo();
$myGrid = $m->myDB->getGridFS('myGrid');
$query = array( $some_query_criteria );
// keep the result so you can iterate over the matching files
$grid_files = $myGrid->find($query);
?>

Your $query variable holds the criteria you want to filter your files by.
Unless you enforce uniqueness on another field yourself, the MongoID (the _id field) will be the only unique key, so let’s pull a file out using a known MongoID.

<?php
// get a reference to your GridFSFile
$query = array( '_id' => new MongoId($known_mongoID) );
$grid_files = $myGrid->find($query);
?>

The query returns a cursor over the matching files (a MongoGridFSCursor in the PHP driver), so you have to iterate through it to get at each file’s data.
The objects it yields are database file objects, not the actual media content.
So let’s display the media directly from the GridFSFile.

Stream the content of your GridFSFile

Get a reference to your GridFSFile and call the getBytes() method to read the file’s chunks and send them to the browser. So, in the same displayFile.php file:

<?php
foreach($grid_files as $grid_file){
    // access your myGrid.files collection data
    $file_meta = $grid_file->file;

    // set the header type of what the file is;  this is where
    // the 'mime' => mime_content_type($some_file)
    // from before comes into play
    header('Content-type: ' . $file_meta['mime']);

    // if the file is a video, then the Content-Length header
    // must also be set; images don't need it
    header('Content-Length: ' . $file_meta['length']);

    echo $grid_file->getBytes();
}
?>
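
One caveat with getBytes(): it returns the whole file as a single string, so a large video is loaded into PHP’s memory before anything is sent. A possible alternative, sketched below, is to copy the file to the output in small buffers. The helper copy_stream_in_buffers is a name invented for this example and works on any PHP stream resource. The legacy driver documents a MongoGridFSFile::getResource() method that returns such a resource, but whether it is available depends on your driver version, so treat that part as an assumption.

```php
<?php
// Hypothetical helper: copies $in to $out in fixed-size buffers so that only
// $bufferSize bytes are held in memory at a time. Returns the bytes copied.
function copy_stream_in_buffers($in, $out, $bufferSize = 8192)
{
    $copied = 0;
    while (!feof($in)) {
        $buffer = fread($in, $bufferSize);
        if ($buffer === false || $buffer === '') {
            break; // read error or nothing left to read
        }
        $copied += fwrite($out, $buffer);
    }
    return $copied;
}

// Demo with an in-memory stream standing in for a GridFS file.
$source = fopen('php://memory', 'w+');
fwrite($source, str_repeat('x', 20000));
rewind($source);

$sink = fopen('php://memory', 'w+');
$bytes = copy_stream_in_buffers($source, $sink);
```

In displayFile.php, the call might look like copy_stream_in_buffers($grid_file->getResource(), fopen('php://output', 'w')), again assuming getResource() exists in your driver version.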

Now if you visit the page displayFile.php you will be able to see your image or video streamed directly from your GridFS. No need to use your own file system!

The entire code:

<?php
/* saveFile.php */
// connect to the 'myGrid' GridFS
$m = new Mongo();
$db = $m->myDB;
$myGrid = $db->getGridFS('myGrid');

// some extra data you may want to store with your file
$data_array = array(
    'mime' => mime_content_type($some_file),
    'timestamp' => time(),
    'metadata' => array( $more_meta_data ),
);

// store a file into the GridFS
$myGrid->storeFile($some_file, $data_array);
?>
<?php
/* displayFile.php */
// connect to the 'myGrid' GridFS
$m = new Mongo();
$myGrid = $m->myDB->getGridFS('myGrid');

$query = array(
    '_id' => new MongoId($known_mongoID_as_string),
);
$grid_files = $myGrid->find($query);

foreach($grid_files as $grid_file){
    // access your myGrid.files collection data
    $file_meta = $grid_file->file;

    header('Content-type: '. $file_meta['mime'] );

    /* not needed if image */
    header('Content-Length: ' . $file_meta['length']);
    echo $grid_file->getBytes();
}
?>

To get further utility from this, you can set the source of your image or video to this URL and it will display accordingly in your webpage:

<!-- if your file is an image -->
<img src="displayFile.php" />

<!-- or if your file is a video you can use the html5
     video tag to stream it -->
<video width="320" height="240" controls="controls">
<source src="displayFile.php"
        type="<?php echo $file_meta['mime'];?>" />
Your browser does not support the video tag.
</video>
<!-- * Note: the video example assumes you have a reference to
     the mime type of your file -->
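
The markup above points every tag at the same displayFile.php, which only works if the script already knows which file to serve. One way to serve several files is to pass the ID in the query string (for example, displayFile.php?id= followed by the 24-character ID) and validate it before building the query. The id parameter and the helper name clean_object_id are assumptions for illustration and not part of the original script; the sketch relies only on the fact that a MongoDB ObjectId is written as 24 hexadecimal characters.

```php
<?php
// Hypothetical helper: returns the ID string if it looks like a MongoDB
// ObjectId (exactly 24 hexadecimal characters), or null otherwise.
function clean_object_id($value)
{
    if (is_string($value) && preg_match('/\A[0-9a-fA-F]{24}\z/', $value) === 1) {
        return strtolower($value);
    }
    return null;
}

// In displayFile.php this could guard the lookup (the "id" query
// parameter is an assumption for this sketch):
$id = clean_object_id(isset($_GET['id']) ? $_GET['id'] : null);
if ($id === null) {
    // No usable ID: answer with "400 Bad Request" instead of querying.
    // header('HTTP/1.1 400 Bad Request'); exit;
}
```

Rejecting anything that is not a plain 24-character hex string also keeps arrays such as id[$ne]=1 from reaching the query.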

And there you have it: save files to MongoDB and stream them onto your webpage. You can get more information at: http://www.mongodb.org/ and find the PHP driver API here: http://php.net/manual/en/mongo.gridfs.php

Thinking in the Command Line

A topic that receives a lot of scrutiny in the web development community is just how much time we spend focusing on our tools. With version control, templating languages, CSS preprocessors, testing frameworks, dependency management, automation and a handful of JavaScript libraries available, the web is a very different place to work in than it was only a few years ago. Although the number of tools can seem staggering, it’s incredibly important to understand the ones you use and to keep trying to speed up or automate the time-consuming or repetitive tasks in your workflow.

It should be noted, however, that fiddling with your tools can be counterproductive (e.g. “let me change that font and color, and taxonomize my folders based upon what mood I’m in today!”), and I’m not condoning tooling just for the sake of tooling.

One of the biggest changes I’ve made to the way I work since starting at Krate has been doing as many tasks as I can through the terminal. I am by no means a pro at the Unix command line. Even so, running as many tasks and tools as possible from the command line will speed up your workflow as a developer considerably, because of the speed and flexibility it offers.

Often, there is a GUI (graphical user interface) version of the tool you use. Good examples of this are Git Tower and CodeKit. In my experience, GUIs can be very limiting. Git Tower and CodeKit are graphical front ends to command line tools, built to make those tools easier to use. Because these GUIs exist to make Sass, CoffeeScript, Git, JSHint, etc. usable without command line knowledge, you’re letting the developers of Git Tower and CodeKit decide what you can do with those tools. However robust these apps may be, they are still just a layer on top of a technology they themselves did not build.

A parallel is why most professional developers no longer choose to build websites with Adobe Dreamweaver. Many developers start out using this program but over time realize that writing code is simply much faster than picking through an interface to build things. When you can customize your tools and work much faster, you start to think about problems in different ways than before.

Without going into what something like "sass --watch -t compact .scssfile:.cssfile" does, the real focus is on how using the command line affects your thinking as a developer. It changes the way you think about the tools you use. Suddenly, they aren’t black boxes that do magical things. Rather, they are often open source repositories that you can learn from and contribute to.

I highly recommend reading Rebecca Murphey’s article and watching Paul Irish’s video on this subject.
