nitrxgen
YouTube CC Downloader - Information

This document explains as much about the YouTube CC Downloader as possible, from how it all works to how best to use it.


Development History

Things to complete:
  • An option to download all the closed captions together as a handy .rar or .zip archive.
  • Perhaps showing the video title and thumbnail.
  • Updating the database properly (hit counting, title, user).

Application Programming Interface

Because this tool is meant to help people access YouTube CC tracks a bit more easily, what better way to achieve that than by providing a nice easy-to-use API?

Because there can be multiple CC tracks per YouTube video, there are two main steps to accessing them using the API. The same is true when communicating with YouTube directly, except I hope this API provides a much more user-friendly experience. The first step is to see what CC tracks are available for the video. The second step is to download whichever track you want.

The first step:
You need to see what CC tracks are available for the YouTube video. You may do this by accessing the following URL:

http://www.nitrxgen.net/youtube_cc/AuX7nPBqDts.csv

You can replace AuX7nPBqDts with any YouTube video ID provided that the video exists, is not set to private, and actually possesses CC track data. Additionally, you can replace .csv with .tsv, .xml, or .json to get the available track data in different formats.
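
As a rough sketch of the first step, the snippet below builds the track-list URL for a given video ID and format. The function names track_list_url and fetch_track_list are made up for illustration and are not part of the tool; the UTF-8 decoding of the response is also an assumption.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://www.nitrxgen.net/youtube_cc"
# Track-list formats named in the text above.
LIST_FORMATS = {"csv", "tsv", "xml", "json"}


def track_list_url(video_id, fmt="csv"):
    """Build the step-one URL that lists the CC tracks of a video."""
    fmt = fmt.lower().lstrip(".")
    if fmt not in LIST_FORMATS:
        raise ValueError(f"unsupported track-list format: {fmt!r}")
    # quote() guards against stray characters in the video ID.
    return f"{BASE_URL}/{quote(video_id, safe='')}.{fmt}"


def fetch_track_list(video_id, fmt="csv", timeout=10):
    """Fetch the raw track list as text (performs a network request).

    Assumption: the response body is UTF-8 text.
    """
    with urlopen(track_list_url(video_id, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling track_list_url("AuX7nPBqDts") gives back the example URL shown above; fetch_track_list only works if the service is reachable and the video meets the conditions just described.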

The most important thing about this first step is deciding which CC track you want, and using the associated track ID for the second step. I'm relying on the assumption that you know where to find the track ID once you've requested this URL; if you do not, then I don't know what you're doing here.

The second step:
Using the data retrieved from the first step (primarily the track ID), you can download the CC track directly using the following URL:

http://www.nitrxgen.net/youtube_cc/AuX7nPBqDts/9.csv

You'll notice only a minor addition with this URL: The video ID AuX7nPBqDts is a directory, and 9.csv appears to be a file. The file's name 9 is actually the track ID you're requesting. Once again, you can replace .csv with .sbv, .srt, .tsv, .txt, .vtt, or .dfxp to retrieve the data in different file formats.
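
The second step can be sketched in the same way. The helper name track_url is hypothetical, and the track ID is treated as an opaque value (the example above uses 9, but nothing here assumes it is always a number).

```python
from urllib.parse import quote

BASE_URL = "http://www.nitrxgen.net/youtube_cc"
# Download formats named in the text above.
TRACK_FORMATS = {"csv", "sbv", "srt", "tsv", "txt", "vtt", "dfxp"}


def track_url(video_id, track_id, fmt="csv"):
    """Build the step-two URL for one CC track of a video."""
    fmt = fmt.lower().lstrip(".")
    if fmt not in TRACK_FORMATS:
        raise ValueError(f"unsupported track format: {fmt!r}")
    # The video ID acts as the directory and the track ID as the file name.
    video_part = quote(video_id, safe="")
    track_part = quote(str(track_id), safe="")
    return f"{BASE_URL}/{video_part}/{track_part}.{fmt}"
```

For example, track_url("AuX7nPBqDts", 9) reproduces the URL shown above, and passing "srt" as the third argument switches the file extension.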

Things to note:
I must point out that some of these file format specifications allow for extra features such as text formatting, on-screen text positioning, and attributes for spoken and non-spoken event data (the name of the person speaking, screaming, doors slamming, noises off-camera, background music information, etc.). Unfortunately, these extras are not available in the data YouTube provides.

I should also point out that some CC tracks may not have correct timing information, for example with every line starting and finishing at 0.000 seconds, or something similar. This is not an error with this tool; the author of the CC track did not take proper care to ensure the track has the correct times. There is nothing I can do about these, as this is what YouTube itself provides. Try enabling the CC track in the actual YouTube video and see if it works; for me, such tracks do not.
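
If you want to spot such tracks automatically, here is a minimal sketch that checks a downloaded .srt file for usable timings. It assumes the .srt output follows the usual SubRip layout, with timing lines like "00:00:01,500 --> 00:00:04,000"; the function names are made up for illustration.

```python
import re

# Matches a SubRip timing line such as "00:00:01,500 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def cue_times(srt_text):
    """Return (start, end) pairs in seconds for every timing line found."""
    return [
        (_seconds(*match.groups()[:4]), _seconds(*match.groups()[4:]))
        for match in TIMING.finditer(srt_text)
    ]


def has_usable_timing(srt_text):
    """Return False if there are no cues or every cue starts and ends at zero."""
    return any(start > 0 or end > 0 for start, end in cue_times(srt_text))
```

A track for which has_usable_timing returns False is most likely one of the badly timed tracks described above rather than a fault in the download.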

Problems on YouTube:
  • Their language code BH (for Bihara) does not actually have Bihara in the track list of the API; it only returns Bh for both the original language name and the translated language name.
  • The Chinese languages in the user-friendly YouTube list do not match those found in the track list of the API either.
  • When attempting to add captions to a video using the Moldovan language, YouTube allows you to enter or upload caption data, but upon returning to the language list to add more translations, you'll find Romanian (Moldova) instead, and it thinks the Moldovan captions haven't been published. Because of this mismatch in identification, trying to edit the Romanian (Moldova) captions gives a YouTube error, and subsequently you can't remove those captions.
  • In general, a couple of YouTube languages are actually listed under country names and not language names (Nauru should be Nauruan; Quechua should be Quechuan, though that is a collection of languages).
  • The Haitian language is not visible to add (for me, at least), even though I've seen Haitian captions added to videos.
  • Serbo-Croatian becomes Serbian (Latin), which already exists as itself, but at least you can delete that one.
  • Tagalog becomes Filipino, which is unnecessary.
  • Twi becomes Akan and then cannot be edited or deleted.
  • I've seen Scots listed on YouTube videos, but only Scottish Gaelic appears, and it uses a different language code.


Frequently Asked Questions

The information in this section is up to date as of 26/01/2017.

  • “When will automatically generated captions be available to download?”

    I did originally intend to provide automatically generated captions when I launched this tool. Sadly, it turned out to be a bit more complicated than I imagined, and I simply didn't bother working on it. In reality, it's a whole different tool in itself due to the complexity. I do still plan to provide automatically generated closed captions at some point, but I can't say when. One day I'll wake up and work on it. That's all I can say.

  • “The video URL I'm entering shows no closed captions. What's wrong?”

    There may be a number of causes. Check that the video exists. If it's your own video, make sure it is not set to private. If it's not your video, there may in fact be no closed captions available for it. Regional restrictions may also apply; videos that cannot be shown outside of the U.S., for example, may also block access to the accompanying closed captions, but I've not yet tested this. If it continues to fail and you are sure there should be closed captions to view, then please let me know, and be sure to include the video URL you're entering so that I can re-create the issue.


Statistics

Statistics are currently disabled due to the rather large sample size collected over the years. A new way to collect statistics will have to be developed. Sorry in advance!

© Copyright 2008-2024: Nitrxgen, all rights reserved.
Source last modified 2407 days ago.