Jose Pastor/gotube

Created Sun, 17 Oct 2021 11:26:14 +0000 Modified Sun, 08 May 2022 23:49:02 -0500

gotube is a Go library that obtains YouTube video metadata from a url. It works by downloading the page's html and searching it for the json data that YouTube embeds there.

code -> Repository

gotube steps

  1. Download the html file from the url
  2. Look for the json inside the html file
  3. Parse the json into a go model to be used

Download html file

GetMetaData ties the three steps together: it downloads the html, finds the json and parses it into a Video value that can be used outside the library.

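import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// GetMetaData downloads the watch page for the video in url and returns its parsed metadata.
func GetMetaData(url string) (Video, error) {
	id, err := ExtractQueryParam(url)
	if err != nil {
		return Video{}, fmt.Errorf("ExtractQueryParam failed: %w", err)
	}

	resp, err := http.Get(fmt.Sprintf("https://www.youtube.com/watch?v=%s", id))
	if err != nil {
		return Video{}, fmt.Errorf("get video failed: %w", err)
	}
	defer resp.Body.Close()

	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
	bodyBytes, err := io.ReadAll(resp.Body)
	if err != nil {
		return Video{}, fmt.Errorf("failed reading response body: %w", err)
	}

	// The player data is embedded in the page as a JavaScript assignment.
	extractJSON := ExtractValue(string(bodyBytes), "ytInitialPlayerResponse = ", ";</script>")

	var youtubeRequest Video
	if err := json.Unmarshal([]byte(extractJSON), &youtubeRequest); err != nil {
		return Video{}, fmt.Errorf("failed parsing json: %w", err)
	}

	return youtubeRequest, nil
}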
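
ExtractQueryParam is not shown in this post. As a rough idea of what such a helper could look like, here is a minimal sketch built on net/url. The name extractVideoID is hypothetical, and the sketch assumes the id always travels in the v query parameter, so short links would need extra handling.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// extractVideoID is a hypothetical stand-in for ExtractQueryParam.
// It assumes the video id is carried in the "v" query parameter.
func extractVideoID(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("parse url: %w", err)
	}
	id := u.Query().Get("v")
	if id == "" {
		return "", errors.New("url has no v query parameter")
	}
	return id, nil
}

func main() {
	id, err := extractVideoID("https://www.youtube.com/watch?v=abc123&t=10s")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id)
}
```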

Search json

The json sits between two markers, a start string and an end string, so we need some logic that isolates the raw json so that Go can parse it without problems.

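// ExtractValue returns the text found between startSource and the first
// endSource that follows it. It returns " " when either marker is missing.
func ExtractValue(source, startSource, endSource string) string {
	idx := strings.Index(source, startSource)
	if idx < 0 {
		return " "
	}
	start := idx + len(startSource)
	// Look for the end marker only after the start marker; an occurrence
	// before it does not count.
	end := Index(source, endSource, start)
	if end < 0 {
		return " "
	}
	return source[start:end]
}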

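// Index is like strings.Index, but starts searching at offset. The result is
// relative to the start of s, or -1 if substr is not found.
func Index(s, substr string, offset int) int {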
	if len(s) < offset {
		return -1
	}
	if idx := strings.Index(s[offset:], substr); idx >= 0 {
		return offset + idx
	}
	return -1
}
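
A different technique, not the one gotube uses, is to skip the end marker entirely and let encoding/json stop at the end of the first json value. The sketch below assumes the embedded value is a json object; decodeAfter is a hypothetical name, and the any keyword assumes Go 1.18 or newer.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// decodeAfter decodes the first json value that follows marker in source into v.
// Whatever comes after that value is left unread.
func decodeAfter(source, marker string, v any) error {
	i := strings.Index(source, marker)
	if i < 0 {
		return errors.New("marker not found")
	}
	dec := json.NewDecoder(strings.NewReader(source[i+len(marker):]))
	if err := dec.Decode(v); err != nil {
		return fmt.Errorf("decode json: %w", err)
	}
	return nil
}

func main() {
	// Made-up page: another statement follows the json before ";</script>".
	html := `<script>var data = {"n":1};var other = 2;</script>`
	var out struct {
		N int `json:"n"`
	}
	if err := decodeAfter(html, "data = ", &out); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out.N)
}
```

The trade-off is that only the start marker has to stay stable, while the marker-based approach gives you the raw json string to inspect or log.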

Parse json into a go model

The json is long, so writing the matching Go structs by hand is tedious. To make the conversion easier, paste a sample of the json into https://mholt.github.io/json-to-go/ and let it generate the structs.
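
To give an idea of what the resulting model looks like, here is a cut-down sketch. The field names and json tags are assumptions chosen for illustration, not the library's actual Video type, and the sample json is made up. Fields that have no match in the struct are simply ignored by json.Unmarshal, so the model only needs the parts you care about.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Video is an assumed, heavily reduced model of the player response.
type Video struct {
	VideoDetails VideoDetails `json:"videoDetails"`
}

// VideoDetails holds a few illustrative fields; the names are assumptions.
type VideoDetails struct {
	VideoID       string `json:"videoId"`
	Title         string `json:"title"`
	Author        string `json:"author"`
	LengthSeconds string `json:"lengthSeconds"`
}

// parseVideo unmarshals raw json into the reduced model.
func parseVideo(raw string) (Video, error) {
	var v Video
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		return Video{}, fmt.Errorf("parse json: %w", err)
	}
	return v, nil
}

func main() {
	// Made-up sample; "extra" has no matching field and is ignored.
	sample := `{"videoDetails":{"videoId":"abc123","title":"demo","author":"someone","lengthSeconds":"42","extra":true}}`
	v, err := parseVideo(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s by %s (%s s)\n", v.VideoDetails.Title, v.VideoDetails.Author, v.VideoDetails.LengthSeconds)
}
```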