Hazem Allbabidi

November 30, 2024 | 5 min read


How GitHub Uploads and Secures Images

GitHub is one of the most widely used platforms among developers, hosting millions of repositories full of files written in different programming languages. Each repository also comes with a good set of tools for managing the project inside it, such as Issue Management, GitHub Actions (for CI/CD), and even a Project Management tab.

One of the things I noticed while using GitHub was how they handle images and image security on their Issues tab. There, you can add Markdown-based content and upload multiple images.

The way uploading is done is simple. You drag-and-drop (or click on the upload button in the editor) an image of type JPEG, PNG, or even GIF (you may also upload other file types like PDF, but you won’t be able to view them on the Preview tab). This creates a link, which you will see in the Markdown editor, that points to a URL similar to this:

https://github.com/user-attachments/assets/<UUID>

If you navigate to the Preview tab, you should be able to see your uploaded image displayed there.

How did this happen?

There are essentially three things that happen when you upload an image.

The Flow

Step 1: Asset Creation Initiation

First, an asset upload request is initiated by sending a POST request containing information about the image, such as the file size and file name, to https://github.com/upload/policies/assets, which returns a 201 (Created) response with the following data:

(There are more values but the ones stated are the ones relevant for this article.)

The first value is simply the metadata of the file such as the name and URL, as well as the ID created for that asset, which we will use in a later step.

The second value is the asset upload URL for GitHub itself. It will be used (and explained) in a later step.

The third value is a special URL that points to the Amazon S3 bucket where the actual file will be uploaded. In case you are unfamiliar with Amazon S3, it is an object storage service hosted by Amazon Web Services that lets you upload files and manage them through its dashboard or API.

The fourth value includes information about the Amazon S3 object and instance, such as the path of the file, the file type, the date, and more.
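To make the first step more concrete, here is a minimal sketch of how the metadata for that POST request could be assembled. The function name and the key names ("name", "size", "content_type") are my own assumptions for illustration; the real request may use different fields.

```python
import mimetypes


def describe_upload(name: str, data: bytes) -> dict:
    """Collect the kind of metadata Step 1 sends about an image.

    The key names are hypothetical; GitHub's actual field names may differ.
    """
    # Guess the MIME type from the file extension, with a generic fallback.
    content_type, _ = mimetypes.guess_type(name)
    return {
        "name": name,
        "size": len(data),
        "content_type": content_type or "application/octet-stream",
    }
```

Calling describe_upload("cat.png", image_bytes) gives a small dictionary that a client could send as the body of the asset creation request.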

Step 2: File Uploading

The next step is uploading the actual file to Amazon S3 through the URL we have seen in the previous step.

A request is sent to the File Upload URL provided by the first request with the following payload data:

This request returns no response body, only a status code, which is 204 (No Content) if the upload succeeded.
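As a rough illustration of what such an upload body can look like, the sketch below encodes a multipart form by hand. It assumes the upload is a browser-style form POST to Amazon S3, where the form fields returned in Step 1 come first and the file comes last (S3 ignores form fields that appear after the file). The function name and the example field names are hypothetical.

```python
import uuid
from typing import Optional


def encode_s3_form(fields: dict, filename: str, data: bytes,
                   content_type: str, boundary: Optional[str] = None):
    """Encode form fields plus one file as multipart/form-data.

    Returns (headers, body). The file part is deliberately written last.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Plain form fields (for example the object path and the file type).
    for key, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{key}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The file itself goes last.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n".encode("utf-8")
        + data
        + b"\r\n"
    )
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, b"".join(parts)
```

In practice an HTTP library would build this body for you; writing it out only shows that the upload is an ordinary form submission with the file as the final field.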

Step 3: File Authentication

The final step is sending a PUT request to the Asset Upload URL we received in the first step, which informs the backend that the upload to Amazon S3 succeeded. This request includes the Asset Upload Authenticity Token, which we also received in Step 1.
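Putting the three steps together, the whole flow can be sketched as a single function. This is only an outline under several assumptions: the send callable stands in for a real HTTP client, the response keys ("asset", "upload_url", "form", "asset_upload_url", "asset_upload_authenticity_token") are hypothetical names for the values described above, and the use of POST for the S3 upload is my guess, since only the POST in Step 1 and the PUT in Step 3 are stated.

```python
from typing import Callable

# From the article: the endpoint that starts an asset upload.
POLICY_URL = "https://github.com/upload/policies/assets"

# send(method, url, payload) -> (status_code, response_dict); a stand-in
# for a real HTTP client so the flow can be followed without a network.
Send = Callable[[str, str, dict], tuple]


def upload_image(send: Send, name: str, data: bytes) -> str:
    # Step 1: announce the upload and receive the policy (expects 201).
    status, policy = send("POST", POLICY_URL, {"name": name, "size": len(data)})
    if status != 201:
        raise RuntimeError(f"asset creation failed with status {status}")

    # Step 2: send the file to the storage URL (expects 204, no body).
    status, _ = send("POST", policy["upload_url"], {**policy["form"], "file": data})
    if status != 204:
        raise RuntimeError(f"file upload failed with status {status}")

    # Step 3: confirm the upload to the backend, passing the token from Step 1.
    token = policy["asset_upload_authenticity_token"]
    status, _ = send("PUT", policy["asset_upload_url"], {"authenticity_token": token})
    if not 200 <= status < 300:
        raise RuntimeError(f"upload confirmation failed with status {status}")

    # The asset metadata carries the URL that ends up in the Markdown editor.
    return policy["asset"]["href"]
```

The point of the sketch is the ordering: the backend hands out permission first, the file goes straight to storage, and only then is the backend told that the asset is ready.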

Retrieving The Image

Once the image has been uploaded, you will see the URL of the image inserted at the location where you uploaded it in the Issue Description.

It will look something like this:

https://github.com/user-attachments/assets/<UUID>

If you click on the Preview tab, you will be able to view the image.

Try copying the link and opening it in two browser windows: the same window where you are logged in to your GitHub account, and an Incognito/Private window.

You will notice that in the first window, you are redirected to a page that displays the image. The URL of that page is the Amazon S3 URL for the image.

But in the Incognito/Private window, you will simply get a 404 page directly from GitHub. Why is that?

File Retrieving Authentication

There are multiple ways in which a website can authenticate its users: some use sessions, others use local storage, and most use cookies.

Whenever a request is made to the backend or API of the website, the authentication token is sent along with it.

The great thing about authenticating with cookies is that for any request made to the same site (e.g. you are on GitHub.com and it makes a request to itself at GitHub.com), the browser includes the cookies automatically, without any extra work in the request.

What makes this even better is that you can use this method to protect private images, as GitHub does.

Here is what happens: you enter the image URL in the browser, and a request is made to that URL along with the cookies the browser has saved for that website. If you are authenticated, you are redirected to the Amazon S3 file URL, which returns the image.

The Incognito/Private window does not have the cookies from your logged-in session, so when you open that URL there, you get a 404 page, because you are not authenticated.
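The behaviour described above can be sketched as a tiny request handler: a valid session cookie leads to a redirect, anything else gets a 404. This is not GitHub's code. The cookie name session_id, the in-memory session store, the 302 status, and the storage URL are all assumptions made for the example.

```python
from http.cookies import SimpleCookie

# Hypothetical session store: session ID -> username.
SESSIONS = {"abc123": "octocat"}


def storage_url_for(asset_id: str) -> str:
    # Placeholder for the (normally signed and short-lived) storage URL.
    return f"https://storage.example.com/assets/{asset_id}"


def handle_asset_request(asset_id: str, cookie_header: str):
    """Return (status, headers) for a request to an asset URL."""
    # Parse the Cookie header the browser attached to the request.
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    session = cookies.get("session_id")

    # No cookie or an unknown session: pretend the asset does not exist.
    if session is None or session.value not in SESSIONS:
        return 404, {}

    # Authenticated: send the browser on to the storage URL.
    return 302, {"Location": storage_url_for(asset_id)}
```

Answering with a 404 rather than a 401 or 403 has a side benefit: someone without access cannot even tell whether the asset exists.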

One thing to note here is that the Amazon S3 URL expires within a few minutes (try refreshing the page that returns the image after 10 minutes; you will get an error page stating that access is denied because the request has expired).
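To show the idea behind an expiring URL, here is a simplified sketch that signs a URL together with an expiry timestamp using HMAC. This is not the algorithm Amazon S3 uses for its presigned URLs; the secret, the query parameter names, and the function names are assumptions chosen to keep the example short.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical key known only to the server


def _signature(base_url: str, expires: int) -> str:
    message = f"{base_url}|{expires}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, ttl_seconds: int, now=None) -> str:
    """Append an expiry time and a signature covering both URL and expiry."""
    current = int(now if now is not None else time.time())
    expires = current + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(base_url, expires)})
    return f"{base_url}?{query}"


def is_valid(url: str, now=None) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    current = now if now is not None else time.time()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(signature, _signature(base_url, expires)) and current < expires
```

Because the expiry time is part of the signed message, editing the expires value in the URL invalidates the signature, so the link cannot be extended by hand. A leaked link therefore stops working shortly after it was issued.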

Conclusion

In conclusion, we learned a few things here:

  1. GitHub uses cookies for authentication (at least for images)
  2. GitHub uses Amazon S3 for file uploading
  3. Cookies can be used to authenticate image retrieval requests

I hope you learned these things and more in this article. See you in the next one!

