gotube is a Go library that retrieves YouTube video metadata from a URL. It works by downloading the video page's HTML and extracting the JSON data that YouTube embeds inside it.
code -> Repository
gotube steps
- Download the HTML file from the URL
- Look for the JSON inside the HTML file
- Parse the JSON into a Go model that can be used
Download HTML file
This function downloads the HTML, finds the JSON and parses it into a Video value that can be used outside the library.
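// GetMetaData downloads the watch page for the video referenced by url,
// extracts the embedded ytInitialPlayerResponse JSON and decodes it into a Video.
// Imports needed: encoding/json, fmt, io, net/http.
func GetMetaData(url string) (Video, error) {
	id, err := ExtractQueryParam(url)
	if err != nil {
		return Video{}, fmt.Errorf("ExtractQueryParam failed: %w", err)
	}

	resp, err := http.Get(fmt.Sprintf("https://www.youtube.com/watch?v=%s", id))
	if err != nil {
		return Video{}, fmt.Errorf("get video failed: %w", err)
	}
	defer resp.Body.Close()

	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
	bodyBytes, err := io.ReadAll(resp.Body)
	if err != nil {
		return Video{}, fmt.Errorf("failed reading response body: %w", err)
	}

	extractJSON := ExtractValue(string(bodyBytes), "ytInitialPlayerResponse = ", ";</script>")

	// Report decoding failures instead of silently returning an empty Video.
	var youtubeRequest Video
	if err := json.Unmarshal([]byte(extractJSON), &youtubeRequest); err != nil {
		return Video{}, fmt.Errorf("failed parsing video JSON: %w", err)
	}
	return youtubeRequest, nil
}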
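ExtractQueryParam is not shown in this post. As a rough idea of what such a helper could look like, here is a minimal sketch that reads the "v" query parameter with the standard net/url package. The name videoID and its behaviour are assumptions for illustration, not the library's actual code, and it only handles watch?v= style URLs (not short youtu.be links).

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// videoID is a hypothetical stand-in for ExtractQueryParam: it returns the
// value of the "v" query parameter from a YouTube watch URL.
func videoID(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("parse url: %w", err)
	}
	id := u.Query().Get("v")
	if id == "" {
		return "", errors.New("url has no v query parameter")
	}
	return id, nil
}

func main() {
	// Made-up video ID, used only to show the call.
	id, err := videoID("https://www.youtube.com/watch?v=abc123XYZ_-&t=42s")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id) // prints "abc123XYZ_-"
}
```

Returning an error when the parameter is missing lets the caller stop before making any HTTP request.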
Search JSON
The JSON sits between two markers in the HTML, a start marker and an end marker. We need a helper that returns only the text between them, so that what we get is pure JSON that Go can parse without problems.
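// ExtractValue returns the text in source between the first occurrence of
// startSource and the next occurrence of endSource after it.
// It returns an empty string when either marker is missing.
func ExtractValue(source, startSource, endSource string) string {
	i := strings.Index(source, startSource)
	if i < 0 {
		return ""
	}
	start := i + len(startSource)
	// Search for the end marker only after the start marker; checking the
	// result avoids a slice panic when the end marker appears only earlier.
	end := Index(source, endSource, start)
	if end < 0 {
		return ""
	}
	return source[start:end]
}

// Index is like strings.Index but begins searching at byte offset offset.
// It returns -1 if substr is not found or offset is out of range.
func Index(s, substr string, offset int) int {
	if offset < 0 || len(s) < offset {
		return -1
	}
	if idx := strings.Index(s[offset:], substr); idx >= 0 {
		return offset + idx
	}
	return -1
}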
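To try the marker idea without touching the network, here is a small sketch that runs against a hand-written HTML fragment. It is not the library's code: it uses strings.Cut (available since Go 1.18) instead of the two functions above, and the helper name between and the sample page are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// between returns the text found after the first start marker and before the
// next end marker. ok is false when either marker is missing.
func between(s, start, end string) (value string, ok bool) {
	_, rest, found := strings.Cut(s, start)
	if !found {
		return "", false
	}
	value, _, found = strings.Cut(rest, end)
	if !found {
		return "", false
	}
	return value, true
}

func main() {
	// Made-up page fragment that mimics how the JSON is embedded.
	page := `<html><script>var ytInitialPlayerResponse = {"videoDetails":{"title":"demo"}};</script></html>`

	raw, ok := between(page, "ytInitialPlayerResponse = ", ";</script>")
	fmt.Println(ok)  // prints "true"
	fmt.Println(raw) // prints {"videoDetails":{"title":"demo"}}
}
```

Returning a separate ok flag makes "marker not found" easy to tell apart from an empty match.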
Parse JSON into a Go model
The JSON is long, so writing the matching Go struct by hand is tedious. To make the conversion easier, it is better to paste the JSON into https://mholt.github.io/json-to-go/ and let it generate the struct.
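To give an idea of what the generated model looks like, here is a minimal sketch with only a handful of fields. The field names (videoDetails, videoId, title, lengthSeconds, author) are assumptions about the embedded JSON, the sample data is invented, and parseVideo is a hypothetical helper. The real Video struct in the library is much larger.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Video is an illustrative subset of the model; the field names below are
// assumptions about the embedded JSON, not the library's actual struct.
type Video struct {
	VideoDetails struct {
		VideoID       string `json:"videoId"`
		Title         string `json:"title"`
		LengthSeconds string `json:"lengthSeconds"`
		Author        string `json:"author"`
	} `json:"videoDetails"`
}

// parseVideo decodes the extracted JSON text into a Video.
func parseVideo(raw string) (Video, error) {
	var v Video
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		return Video{}, fmt.Errorf("decode video json: %w", err)
	}
	return v, nil
}

func main() {
	// Invented sample; keys that the struct does not declare are ignored.
	raw := `{"videoDetails":{"videoId":"abc123","title":"demo","lengthSeconds":"213","author":"someone","extra":true}}`

	v, err := parseVideo(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(v.VideoDetails.Title, v.VideoDetails.LengthSeconds) // prints "demo 213"
}
```

Because encoding/json skips keys that have no matching field, the struct only needs the fields you actually plan to read.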