lyricsfandom¶
lyricsfandom.api¶
API and other classes to connect to Lyric Wiki.
-
class
lyricsfandom.api.
LyricWiki
(verbose=False, sleep=0, user=None)[source]¶ Main API for Lyric Wiki scraping.
It wraps the Artist, Album and Song classes.
-
get_albums
(artist_name, cover=True, other=True)[source]¶ Get all albums from an artist.
Parameters: - artist_name (string) – name of the artist to get.
- cover (bool) – if True, scrape featuring or cover songs.
- other (bool) – if True, scrape remixes or compilation albums.
Returns: list(Album)
-
get_discography
(artist_name, cover=True, other=True, encode=None)[source]¶ Get the discography of an artist, in a JSON format.
Note
The returned dictionary is in a nested format.
Parameters: - artist_name (string) – name of the artist to get.
- cover (bool) – if True, scrape featuring or cover songs.
- other (bool) – if True, scrape remixes or compilation albums.
- encode (string) – encoding applied to the text (ex: encode='ascii'). Default is None.
Returns: dict
-
get_lyrics
(artist_name, cover=False, other=False)[source]¶ Get all lyrics from an artist.
Parameters: - artist_name (string) – name of the artist to get.
- cover (bool) – if True, scrape featuring or cover songs.
- other (bool) – if True, scrape remixes or compilation albums.
Returns: lyrics in a JSON format.
Return type: dict
-
search_album
(artist_name, album_name)[source]¶ Search an album from Lyric Wiki server.
Parameters: - artist_name (string) – name of the artist who made the album.
- album_name (string) – name of the album.
Returns: Album
-
search_artist
(artist_name, cover=False, other=False)[source]¶ Search an artist from Lyric Wiki server.
Parameters: - artist_name (string) – name of the artist to get.
- cover (bool) – if True, scrape featuring or cover songs.
- other (bool) – if True, scrape remixes or compilation albums.
Returns: Artist
-
search_song
(artist_name, song_name)[source]¶ Search a song from Lyric Wiki server.
Parameters: - artist_name (string) – name of the artist who made the song.
- song_name (string) – name of the song.
Returns: Song
-
set_sleep
(sleep)[source]¶ Time before connecting again to a new page.
Parameters: sleep (float) – seconds to wait.
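The delay set by set_sleep can be pictured with a minimal sketch. PoliteSession, its fetch method and the injectable sleeper are hypothetical names for illustration only; they are not part of lyricsfandom, and the real class may throttle differently.

```python
import time


class PoliteSession:
    """Hypothetical helper: pause for a fixed delay before every page request."""

    def __init__(self, sleep=0.0, sleeper=time.sleep):
        self.sleep = sleep
        self._sleeper = sleeper  # injectable so the pause can be observed without waiting

    def set_sleep(self, sleep):
        # Mirrors the documented setter: seconds to wait before the next page.
        if sleep < 0:
            raise ValueError("sleep must be non-negative")
        self.sleep = float(sleep)

    def fetch(self, url, opener):
        # Wait first, then delegate the actual download to `opener`.
        if self.sleep > 0:
            self._sleeper(self.sleep)
        return opener(url)


# Record the pauses instead of really sleeping.
pauses = []
session = PoliteSession(sleeper=pauses.append)
session.set_sleep(1.5)
pages = [session.fetch(u, opener=lambda u: f"<html>{u}</html>") for u in ("a", "b")]
```

Waiting before each request, rather than after, keeps the first call polite as well when several sessions run back to back.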
-
lyricsfandom.meta¶
Base classes. They are optional and can be removed for simplicity. However, they provide a better API and separate Artist / Album / Song more cleanly.
-
class
lyricsfandom.meta.
AlbumMeta
(artist_name, album_name, album_year=None, album_type=None)[source]¶ Defines an abstract Album from https://lyrics.fandom.com/wiki/.
album_name: album of the artist.
album_type: type of album.
album_year: release year of the album.
-
classmethod
from_artist
(artist, album_name)[source]¶ Construct an Album from an Artist.
Parameters: - artist (Artist) – artist to extract the album from.
- album_name (string) – album name.
-
classmethod
from_url
(url)[source]¶ Construct an Album from an url.
Parameters: url (string) – url of the album page.
-
get_artist
()[source]¶ Retrieve the artist class linked to the album (if it exists). It is usually called when an album has been searched from an Artist class. Calling this function then returns that same Artist object.
Returns: Artist
-
register_artist
(artist)[source]¶ Manually set the pointer to an Artist.
Parameters: artist (Artist) – artist related to the album.
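The relationship between register_artist and get_artist can be sketched as a plain parent pointer. LinkedAlbum and ParentArtist below are hypothetical stand-ins, not the library classes, and the private attribute name is an assumption.

```python
class ParentArtist:
    """Hypothetical stand-in for an Artist object."""

    def __init__(self, artist_name):
        self.artist_name = artist_name


class LinkedAlbum:
    """Hypothetical stand-in for an album that may point back to its artist."""

    def __init__(self, artist_name, album_name):
        self.artist_name = artist_name
        self.album_name = album_name
        self._artist = None  # no parent until one is registered

    def register_artist(self, artist):
        # Store a reference to the parent object, not a copy.
        self._artist = artist

    def get_artist(self):
        # Returns None when the album was created from scratch.
        return self._artist


artist = ParentArtist("Bon Iver")
album = LinkedAlbum("Bon Iver", "For Emma, Forever Ago")
unlinked = album.get_artist()   # nothing registered yet
album.register_artist(artist)
linked = album.get_artist()     # the very same object as `artist`
```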
-
class
lyricsfandom.meta.
ArtistMeta
(artist_name)[source]¶ Defines an abstract Artist / Band from https://lyrics.fandom.com/wiki/.
artist_name: name of the artist.
artist_id: id of the artist.
base: base page of the artist.
href: href page of the artist.
url: url page of the artist.
-
classmethod
from_url
(url)[source]¶ Construct an Artist from an url.
Parameters: url (string) – url of the artist page.
-
get_links
()[source]¶ Retrieve merchandise links from a Lyric Wiki page. If the page (and its links) exists, the result is saved in a private attribute, so the links are not loaded again when the method is called multiple times.
Returns: dict
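The caching behaviour described for get_links can be sketched as lazy loading into a private attribute. LinksCache and fake_loader are hypothetical names; the library's actual attribute and loading code are not shown in this reference.

```python
class LinksCache:
    """Hypothetical sketch: load links once, then serve them from memory."""

    def __init__(self, loader):
        self._loader = loader  # callable that would scrape the page
        self._links = None     # private cache, empty until the first call

    def get_links(self):
        if self._links is None:
            self._links = self._loader()
        return self._links


calls = []


def fake_loader():
    # Stands in for the network request; counts how often it runs.
    calls.append("request")
    return {"Bandcamp": ["https://ohdaughter.bandcamp.com/"]}


cache = LinksCache(fake_loader)
first = cache.get_links()
second = cache.get_links()  # served from the private attribute
```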
-
class
lyricsfandom.meta.
LyricWikiMeta
[source]¶ LyricWikiMeta is an abstract class that every object pointing to the Lyric Wiki website should inherit from. It provides the basic set-up to connect to and access the Lyric Wiki website.
-
class
lyricsfandom.meta.
SongMeta
(artist_name, song_name, album_name=None, album_year=None, album_type=None)[source]¶ Defines an abstract Song from https://lyrics.fandom.com/.
song_name: name of the song.
song_id: id of the song.
lyrics: lyrics of the song.
-
classmethod
from_album
(album, song_name)[source]¶ Construct a Song from an Album.
Parameters: - album (Album) – album to extract the song from.
- song_name (string) – song name.
-
classmethod
from_artist
(artist, song_name)[source]¶ Construct a Song from an Artist.
Parameters: - artist (Artist) – artist to extract the song from.
- song_name (string) – song name.
-
classmethod
from_url
(url)[source]¶ Construct a Song from an url.
Parameters: url (string) – url of the lyrics song page.
-
register_album
(album)[source]¶ Link the song to a parent album.
Parameters: album (Album) – album to link to the song.
-
set_lyrics
(value)[source]¶ Manually set the lyrics of the current song.
Parameters: value (string) – new lyrics.
lyricsfandom.connect¶
A scraper is used to connect to a website and extract data.
lyricsfandom.scrape¶
Functions used to connect, extract, and display data from lyrics fandom website.
These functions are used to scrape data from an HTML page connection. They are used inside the Artist, Album and Song classes.
Most of these functions take a soup parameter, i.e. a Beautiful Soup Tag element of a web page (usually the whole page, not just a <div> or other HTML element).
-
lyricsfandom.scrape.
generate_album_url
(artist_name, album_name, album_year)[source]¶ Generate the Lyric Wiki url of an album page from its artist, name and year.
Parameters: - artist_name (string) – name of the Artist.
- album_name (string) – name of the Album.
- album_year (string) – year of an Album.
Returns: string
- Examples::
>>> artist_name = 'london grammar'
>>> album_name = 'if you wait'
>>> album_year = 2013
>>> generate_album_url(artist_name, album_name, album_year)
https://lyrics.fandom.com/wiki/London_Grammar:If_You_Wait_(2013)
-
lyricsfandom.scrape.
generate_artist_url
(artist_name)[source]¶ Generate a Lyric Wiki url of an artist page from its name.
Parameters: artist_name (string) – name of the Artist.
Returns: string
- Examples::
>>> artist_name = 'london grammar'
>>> generate_artist_url(artist_name)
https://lyrics.fandom.com/wiki/London_Grammar
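The two documented examples suggest a simple naming rule: capitalize each word, join words with underscores, and append the album and year after a colon. The sketch below reproduces those two outputs under that assumption. to_wiki_name, sketch_artist_url and sketch_album_url are hypothetical names, and the real functions may treat punctuation or special characters differently.

```python
BASE_URL = "https://lyrics.fandom.com/wiki/"


def to_wiki_name(name):
    # Capitalize the first letter of every word and join with underscores.
    # Only the first letter is touched, so "don't" stays "Don't".
    words = str(name).split()
    return "_".join(word[:1].upper() + word[1:] for word in words)


def sketch_artist_url(artist_name):
    return BASE_URL + to_wiki_name(artist_name)


def sketch_album_url(artist_name, album_name, album_year):
    # Pattern taken from the documented example: Artist:Album_(Year)
    return f"{sketch_artist_url(artist_name)}:{to_wiki_name(album_name)}_({album_year})"


artist_url = sketch_artist_url("london grammar")
album_url = sketch_album_url("london grammar", "if you wait", 2013)
```

The sketch does no percent-encoding, so names containing characters that are not URL-safe would need extra handling.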
-
lyricsfandom.scrape.
get_artist_info
(soup)[source]¶ Get additional information about the artist / band.
Parameters: soup (bs4.element.Tag) – connection to a wiki artist page. Returns: dict
-
lyricsfandom.scrape.
get_external_links
(soup)[source]¶ Retrieve the different links from a Lyric Wiki page. The links returned can be found in the External Links page section, and usually refer to other platforms (like Last.fm, Amazon, iTunes etc.).
Parameters: soup (bs4.element.Tag) – connection to the Lyric Wiki page.
Returns: dict
- Examples::
>>> # Import packages
>>> import bs4  # for web scraping
>>> import urllib.request  # to connect
>>> # Set up: connect to a Lyric Wiki page
>>> USER = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
>>> HEADERS = {'User-Agent': USER}
>>> URL = 'https://lyrics.fandom.com/wiki/London_Grammar:Who_Am_I'
>>> req = urllib.request.Request(URL, headers=HEADERS)
>>> page = urllib.request.urlopen(req)
>>> soup = bs4.BeautifulSoup(page, 'lxml')
>>> # Retrieve links from the page
>>> get_external_links(soup)
{'Amazon': ['https://www.amazon.com/exec/obidos/redirect?link_code=ur2&tag=wikia-20&camp=1789&creative=9325&path=https%3A%2F%2Fwww.amazon.com%2Fdp%2FB00J0QJ84E'],
 'Last.fm': ['https://www.last.fm/music/London+Grammar', 'https://www.last.fm/music/London+Grammar/If+You+Wait'],
 'iTunes': ['https://itunes.apple.com/us/album/695805771'],
 'AllMusic': ['https://www.allmusic.com/album/mw0002559862'],
 'Discogs': ['http://www.discogs.com/master/595953'],
 'MusicBrainz': ['https://musicbrainz.org/release-group/dbf36a9a-df02-41c4-8fa9-5afe599960b0'],
 'Spotify': ['https://open.spotify.com/album/0YTj3vyjZmlfp16S2XGo50']}
-
lyricsfandom.scrape.
get_lyrics
(soup)[source]¶ Get lyrics from a Lyric Wiki song page.
Returns: string
- Examples::
>>> # Import packages
>>> import bs4  # for web scraping
>>> import urllib.request  # to connect
>>> # Set up: connect to a Lyric Wiki page
>>> USER = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
>>> HEADERS = {'User-Agent': USER}
>>> URL = 'https://lyrics.fandom.com/wiki/London_Grammar:Shyer'
>>> req = urllib.request.Request(URL, headers=HEADERS)
>>> page = urllib.request.urlopen(req)
>>> soup = bs4.BeautifulSoup(page, 'lxml')
>>> # Scrape the lyrics
>>> lyrics = get_lyrics(soup)
>>> print(lyrics)
I'm feeling shyer and the world gets darker Hold yourself a little higher Bridge that gap just further And all your being I'd ask you to give it up An ancient feeling love So beautifully dressed up
Feeling shyer, I’m feeling shyer I’m feeling shyer
Maybe you should call her Deep in the night for her And all your being I’d ask you to give it up I’d ask you to give it up
-
lyricsfandom.scrape.
scrape_albums
(soup)[source]¶ Scrape albums tags, usually from the main artist wiki page. This function will successively yield albums.
Note
The function yields <h2> tags.
Parameters: soup (bs4.element.Tag) – artist page connection.
Returns: album tags of an artist page.
Return type: yield bs4.element.Tag
- Examples::
>>> # Import packages
>>> import bs4  # for web scraping
>>> import urllib.request  # to connect
>>> # Set up: connect to a Lyric Wiki page
>>> USER = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
>>> HEADERS = {'User-Agent': USER}
>>> URL = 'https://lyrics.fandom.com/wiki/London_Grammar'
>>> req = urllib.request.Request(URL, headers=HEADERS)
>>> page = urllib.request.urlopen(req)
>>> soup = bs4.BeautifulSoup(page, 'lxml')
>>> # Scrape albums
>>> for album_tag in scrape_albums(soup):
...     print(album_tag.text)
Strong (2013)
If You Wait (2013)
Truth Is a Beautiful Thing (2017)
Songs on Compilations and Soundtracks
Additional information
External links
-
lyricsfandom.scrape.
scrape_songs
(album_h2_tag, li_tag='ol')[source]¶ Scrape songs from an album. This function should be used to scrape an artist’s page. The optional parameter li_tag specifies whether to scrape released albums ('ol' tags) or covers, singles, live recordings etc. ('ul' tags). Both can be combined using li_tag=['ol', 'ul'] to scrape all songs.
Parameters: - album_h2_tag (bs4.element.Tag) – album tag. Only songs under this tag will be yielded.
- li_tag (string or iterable) – tag names to scrape songs from.
Returns: yield song tags corresponding to the album tag.
Return type: yield bs4.element.Tag
- Examples::
>>> # Import packages
>>> import bs4  # for web scraping
>>> import urllib.request  # to connect
>>> # Set up: connect to a Lyric Wiki page
>>> USER = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
>>> HEADERS = {'User-Agent': USER}
>>> URL = 'https://lyrics.fandom.com/wiki/London_Grammar'
>>> req = urllib.request.Request(URL, headers=HEADERS)
>>> page = urllib.request.urlopen(req)
>>> soup = bs4.BeautifulSoup(page, 'lxml')
>>> # Scrape songs from the first album, the 'Strong (2013)' EP.
>>> album_h2_tag = soup.select('h2 .mw-headline')[0].parent
>>> for song_tag in scrape_songs(album_h2_tag):
...     print(song_tag.text)
Strong
Feelings
>>> # Scrape all songs from the artist page
>>> for album_tag in scrape_albums(soup):
...     album_h2_tag = album_tag.parent
...     print(album_h2_tag.text)
...     for song_tag in scrape_songs(album_h2_tag):
...         print(song_tag.text)
...     print('------------')
Strong (2013)
Strong
Feelings
------------
If You Wait (2013)
Hey Now
Stay Awake
Shyer
Wasting My Young Years
Sights
Strong
etc. ...
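The effect of li_tag can be illustrated offline with a rough sketch. It uses the standard library's html.parser instead of Beautiful Soup, so it is a different technique from the library's own. AlbumSongParser, songs_by_album and the SAMPLE markup (including the "Placeholder Cover" entry) are made up for illustration and only imitate the heading-then-list layout described above.

```python
from html.parser import HTMLParser


class AlbumSongParser(HTMLParser):
    """Group <li> text under the latest <h2>, for the chosen list tags only."""

    def __init__(self, li_tag="ol"):
        super().__init__()
        self.list_tags = {li_tag} if isinstance(li_tag, str) else set(li_tag)
        self.albums = {}        # heading text -> list of song titles
        self._heading = None    # text of the most recent <h2>
        self._list_stack = []   # currently open <ol> / <ul> tags
        self._capture = None    # 'h2' or 'li' while text is being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._capture, self._buffer = "h2", []
        elif tag in ("ol", "ul"):
            self._list_stack.append(tag)
        elif tag == "li" and self._list_stack and self._list_stack[-1] in self.list_tags:
            self._capture, self._buffer = "li", []

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("ol", "ul") and self._list_stack:
            self._list_stack.pop()
        elif tag == self._capture:
            text = "".join(self._buffer).strip()
            if tag == "h2":
                self._heading = text
                self.albums.setdefault(text, [])
            elif self._heading is not None:
                self.albums[self._heading].append(text)
            self._capture = None


# Made-up markup: released albums use <ol>, other songs use <ul>.
SAMPLE = """
<h2><span class="mw-headline">Strong (2013)</span></h2>
<ol><li><a>Strong</a></li><li><a>Feelings</a></li></ol>
<h2><span class="mw-headline">Other Songs</span></h2>
<ul><li><a>Placeholder Cover</a></li></ul>
"""


def songs_by_album(html, li_tag="ol"):
    parser = AlbumSongParser(li_tag)
    parser.feed(html)
    parser.close()
    return parser.albums


released_only = songs_by_album(SAMPLE)
everything = songs_by_album(SAMPLE, li_tag=["ol", "ul"])
```

With the default 'ol' the "Other Songs" heading is still listed but stays empty. Passing both tag names fills it in.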
lyricsfandom.utils¶
Utilities functions.
-
lyricsfandom.utils.
capitalize
(string_raw)[source]¶ Capitalize a string, even if it is between quotes like “, ‘.
Parameters: string_raw (string) – text to capitalize. Returns: string
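Python's built-in str.capitalize leaves a string such as '"hello"' unchanged, because its first character is a quote. One way to get the behaviour described here is to upper-case the first alphabetic character instead. capitalize_sketch is a hypothetical name, and the library's capitalize may well differ, for example by capitalizing every word.

```python
def capitalize_sketch(text):
    # Upper-case the first letter found, skipping leading quotes or punctuation.
    for index, char in enumerate(text):
        if char.isalpha():
            return text[:index] + char.upper() + text[index + 1:]
    return text  # no letters at all: nothing to capitalize


quoted = capitalize_sketch('"hello" world')
single = capitalize_sketch("'youth'")
builtin = '"hello" world'.capitalize()  # the quote blocks the built-in method
```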
-
lyricsfandom.utils.
name_to_wiki
(name)[source]¶ Process artist, album and song’s name.
Parameters: name – Returns:
-
lyricsfandom.utils.
name_to_wiki_id
(name)[source]¶ Generate a Lyric Wiki ID from a name.
Parameters: name (string) – name of an artist / song. Returns: string
-
lyricsfandom.utils.
parse_album_header
(album_header)[source]¶ Split the album title in half, to retrieve its name and year.
Examples:
>>> album_title = 'His Young Heart (2011)'
>>> parse_album_header(album_title)
(His Young Heart, 2011)
Parameters: album_header (string) – album header / title to split
Returns: album name and year.
Return type: tuple
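A header of the form "Name (Year)" can be split with a regular expression. This is a minimal sketch that assumes a four-digit year in trailing parentheses. parse_album_header_sketch is a hypothetical name, and it returns the year as a string, which may not match the library's return type.

```python
import re

# Lazy name, then a four-digit year in parentheses at the very end.
ALBUM_HEADER = re.compile(r"^(?P<name>.*?)\s*\((?P<year>\d{4})\)\s*$")


def parse_album_header_sketch(album_header):
    header = album_header.strip()
    match = ALBUM_HEADER.match(header)
    if match is None:
        # Headers such as 'Other Songs' carry no year.
        return header, None
    return match.group("name"), match.group("year")


with_year = parse_album_header_sketch("His Young Heart (2011)")
without_year = parse_album_header_sketch("Other Songs")
```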
-
lyricsfandom.utils.
parse_song_title
(song_title, artist_name=None)[source]¶ Split a song title to retrieve the artist name and song name. Additional argument can be added to better retrieve these names.
Parameters: - song_title (string) – song header (or title for the <a> element)
- artist_name (string, optional) – name of the artist.
Returns: tuple
-
lyricsfandom.utils.
process_lyrics
(lyrics)[source]¶ Process lyrics.
Parameters: lyrics (string) – lyrics to tokenize / modify. Returns: string
-
lyricsfandom.utils.
serialize_dict
(dict_raw)[source]¶ Serialize a dictionary in ASCII format so it can be saved as a JSON.
Parameters: dict_raw (dict) – Returns: dict
-
lyricsfandom.utils.
serialize_list
(list_raw)[source]¶ Serialize a list in ASCII format, so it can be saved as a JSON.
Parameters: list_raw (list) – Returns: list
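The reference does not say how non-ASCII characters are handled. The sketch below assumes they are simply dropped, and walks nested dictionaries and lists recursively. to_ascii is a hypothetical helper, not the library's implementation; the sample string reuses the curly apostrophe from the lyrics shown earlier.

```python
import json


def to_ascii(value):
    # Strings: drop every character that has no ASCII encoding (assumed policy).
    if isinstance(value, str):
        return value.encode("ascii", "ignore").decode("ascii")
    # Containers: recurse so nested structures are cleaned as well.
    if isinstance(value, dict):
        return {to_ascii(key): to_ascii(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_ascii(item) for item in value]
    return value  # numbers, booleans and None pass through unchanged


raw = {"artist": "London Grammar", "lyrics": ["I\u2019m feeling shyer"], "year": 2013}
clean = to_ascii(raw)
as_json = json.dumps(clean)
```

Dropping characters loses information. Replacing them, or transliterating with unicodedata.normalize, are alternatives when the text must stay readable.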
-
lyricsfandom.utils.
split_header
(header)[source]¶ Split the header to get the artist name, album, and year.
Examples:
>>> album_title = 'Daughter:His Young Heart (2011)'
>>> split_header(album_title)
(Daughter, His Young Heart, 2011)
Parameters: header (string) – album header / title to split
Returns: artist name, album name and year.
Return type: tuple
-
lyricsfandom.utils.
split_song_header
(song_header)[source]¶ Split the song title in half, to retrieve its artist and name.
Examples:
>>> song_header = 'Daughter:Run Lyrics'
>>> split_song_header(song_header)
(Daughter, Run)
Parameters: song_header (string) – song header / title to split
Returns: artist name and song name.
Return type: tuple
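The documented example implies two steps: split on the first colon, then drop the trailing "Lyrics" word. A minimal sketch under that assumption follows. split_song_header_sketch is a hypothetical name, and returning None for a missing artist is an illustrative choice.

```python
def split_song_header_sketch(song_header):
    # Split on the first colon only, so song titles may contain colons.
    artist, separator, title = song_header.partition(":")
    if not separator:
        # No artist prefix: treat the whole header as the title.
        artist, title = None, song_header
    title = title.strip()
    suffix = " Lyrics"
    if title.endswith(suffix):
        title = title[: -len(suffix)]
    return (artist.strip() if artist is not None else None), title


with_artist = split_song_header_sketch("Daughter:Run Lyrics")
no_artist = split_song_header_sketch("Run Lyrics")
```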
lyricsfandom.music¶
lyricsfandom.music.artist¶
Defines an artist from LyricWiki
server.
Extract albums and songs from https://lyrics.fandom.com/Artist_Name
page.
- Examples::
>>> # Note that names are not case sensitive
>>> artist = Artist('daughter')
>>> artist
Artist: Daughter
>>> # Get all albums (compilation, covers etc. included)
>>> artist.get_albums()
[Daughter: EP "His Young Heart" (2011), Songs: 4,
 Daughter: EP "The Wild Youth" (2011), Songs: 4,
 Daughter: Album "If You Leave" (2013), Songs: 12,
 Daughter: Album "Not To Disappear" (2016), Songs: 11,
 Daughter: Album "Music From Before The Storm" (2017), Songs: 13,
 Daughter: "Songs On Compilations", Songs: 2,
 Daughter: Single "Other Songs", Songs: 1]
>>> # Only look for albums / singles released by the artist
>>> artist.get_albums(cover=False, other=False)
[Daughter: EP "His Young Heart" (2011), Songs: 4,
 Daughter: EP "The Wild Youth" (2011), Songs: 4,
 Daughter: Album "If You Leave" (2013), Songs: 12,
 Daughter: Album "Not To Disappear" (2016), Songs: 11,
 Daughter: Album "Music From Before The Storm" (2017), Songs: 13,
 Daughter: Single "Other Songs", Songs: 1]
>>> # The same applies to get_songs()
>>> # Look for an album / song from the artist
>>> song = artist.search_song('candles')
>>> lyrics = song.get_lyrics()
>>> print(lyrics)
That boy, take me away, into the night Out of the hum of the street lights and into a forest I'll do whatever you say to me in the dark Scared I'll be torn apart by a wolf in mask of a familiar name on a birthday card
Blow out all the candles, blow out all the candles “You’re too old to be so shy,” he says to me so I stay the night Just a young heart confusing my mind, but we’re both in silence Wide-eyed, both in silence Wide-eyed, like we’re in a crime scene
etc. …
>>> # Retrieve the artist from a song / album object
>>> song.get_artist()
Artist: Daughter
>>> # Get additional information from the artist
>>> artist.get_info()
{'Years Active': '2010 - present',
 'Band Members': ['Elena Tonra', 'Igor Haefeli', 'Remi Aguilella'],
 'Genres': ['Indie Folk', 'Folk Rock'],
 'Record Labels': ['4AD']}
>>> # Get merchandise links
>>> artist.get_links()
{'Amazon': ['https://www.amazon.com/exec/obidos/redirect?link_code=ur2&tag=wikia-20&camp=1789&creative=9325&path=https%3A%2F%2Fwww.amazon.com%2F-%2Fe%2FB001LHN42M'],
 'iTunes': ['https://itunes.apple.com/us/artist/469701923'],
 'AllMusic': ['https://www.allmusic.com/artist/mn0003013627'],
 'Discogs': ['http://www.discogs.com/artist/2218596'],
 'MusicBrainz': ['https://musicbrainz.org/artist/a1ced3e5-476c-4046-bd74-d428f419989b'],
 'Spotify': ['https://open.spotify.com/artist/46CitWgnWrvF9t70C2p1Me'],
 'Bandcamp': ['https://ohdaughter.bandcamp.com/']}
>>> # Convert the data to JSON
>>> data = artist.to_json(encode='ascii', nested=False)
These are the most common functions, but others can be used to modify the data.
-
class
lyricsfandom.music.artist.
Artist
(artist_name)[source]¶ Defines an Artist / Band from https://lyrics.fandom.com/wiki/.
artist_name: name of the artist.
base: base page of Lyric Wiki.
href: href link of the artist.
url: url page of the artist.
-
add_album
(album, force=None)[source]¶ Add an album to the artist. When adding an album, its artist name can be changed to match the parent artist, using force=True. If the provided album is the name of an album (a string), an (empty) album is automatically created and added to the artist.
Parameters: - album (Album or string) – album (or album name) to add to the current artist.
- force (bool) – if True, change the album’s artist_name attribute to match the artist’s name.
- Examples::
>>> artist = Artist('daughter')
>>> album = Album('daugghter', 'the wild youth')
>>> artist.add_album(album)
>>> artist.get_albums()
[Daugghter: "The Wild Youth", Songs: 0]
>>> artist = Artist('daughter')
>>> album = Album('daugghter', 'the wild youth')
>>> artist.add_album(album, force=True)
>>> artist.get_albums()
[Daughter: "The Wild Youth", Songs: 0]
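The difference between the two examples comes down to whether the child's artist_name is overwritten. The sketch below models that with two small stand-in classes, SketchArtist and SketchAlbum. They are hypothetical and ignore scraping entirely; only the force and string-handling logic described above is reproduced.

```python
class SketchAlbum:
    """Hypothetical stand-in for an album."""

    def __init__(self, artist_name, album_name):
        self.artist_name = artist_name
        self.album_name = album_name


class SketchArtist:
    """Hypothetical stand-in for an artist holding a list of albums."""

    def __init__(self, artist_name):
        self.artist_name = artist_name
        self.albums = []

    def add_album(self, album, force=False):
        if isinstance(album, str):
            # A bare name becomes a new, empty album owned by this artist.
            album = SketchAlbum(self.artist_name, album)
        elif force:
            # Overwrite the album's artist name to match the parent.
            album.artist_name = self.artist_name
        self.albums.append(album)
        return album


artist = SketchArtist("Daughter")
kept = artist.add_album(SketchAlbum("Daugghter", "The Wild Youth"))
forced = artist.add_album(SketchAlbum("Daugghter", "The Wild Youth"), force=True)
created = artist.add_album("If You Leave")
```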
-
classmethod
from_url
(url)[source]¶ Construct an Artist from an url.
Parameters: url (string) – url.
Returns: Artist
- Examples::
>>> artist = Artist.from_url('https://lyrics.fandom.com/wiki/Daughter')
>>> artist
Artist: Daughter
-
get_albums
(cover=False, other=False)[source]¶ Get a list of all albums made by the artist. Keyword arguments can be provided to scrape only released albums, and reject covers, remixes, compilations etc.
Parameters: - cover (bool) – if True, scrape featuring or cover songs.
- other (bool) – if True, scrape remixes or compilation albums.
Returns: list
-
get_info
()[source]¶ Retrieve additional information of an Artist (like band members, labels, genres etc.).
Returns: dict
- Examples::
>>> artist = Artist('Daughter')
>>> artist.get_info()
{'Years Active': '2010 - present',
 'Band Members': ['Elena Tonra', 'Igor Haefeli', 'Remi Aguilella'],
 'Genres': ['Indie Folk', 'Folk Rock'],
 'Record Labels': ['4AD']}
-
get_songs
(cover=False, other=False)[source]¶ Get a list of all songs made by the artist. Keyword arguments can be provided to scrape only songs made by the artist, and reject covers etc.
Parameters: - cover (bool) – if True, scrape featuring or cover songs.
- other (bool) – if True, scrape remixes or compilation albums.
Returns: list
-
items
(cover=True, other=True)[source]¶ Connect to the LyricWiki server and scrape albums / songs. Keyword arguments can be provided to scrape only released albums, and reject covers, remixes, compilations etc.
Parameters: - cover (bool) – if True, scrape featuring or cover songs.
- other (bool) – if True, scrape remixes or compilation albums.
Returns: yield Album
-
search_album
(album_name)[source]¶ Search an album from an artist’s discography.
Parameters: album_name (string) – name of the album to look for. Returns: Album
lyricsfandom.music.album¶
Extract lyrics and songs from https://lyrics.fandom.com/
website.
Examples
# 1. Generate an album from scratch
album = Album('Bon Iver', 'For Emma, Forever Ago')
# Scrape songs.
songs = album.get_songs()
# Be careful: as this album was created from scratch, it is not linked to any ``Artist`` instance.
# However, there is still the artist's name saved.
album.get_artist() # None
album.artist_name # 'Bon Iver'
# 2. Use an album from an artist
artist = Artist('Bon Iver')
album = Album.from_artist(artist, 'For Emma, Forever Ago')
album.get_artist() # Artist: 'Bon Iver'
# Or search it from the artist class.
album = artist.search_album('For Emma, Forever Ago')
-
class
lyricsfandom.music.album.
Album
(artist_name, album_name, album_type=None, album_year=None)[source]¶ Defines an Album from https://lyrics.fandom.com/wiki/.
album_name: album of the artist.
album_type: type of album.
album_year: release year of the album.
songs: songs of the album.
-
add_song
(song, force=None)[source]¶ Add a song to the album. When adding, the song’s artist name / album names can be changed to match the parent album, using force=True. If the provided song is the name of a song (a string), an (empty) song is automatically created and added to the album.
Parameters: - song (Song or string) – song (or song name) to add to the current album.
- force (bool) – if True, change the song’s artist_name, album_name, album_year, album_type attributes to match its parent.
- Examples::
>>> album = Album('daughter', 'the wild youth')
>>> song = Song('daughter', 'youth')
>>> album.add_song(song)
>>> album
Daughter: "The Wild Youth", Songs: 5
-
classmethod
from_artist
(artist, album_name)[source]¶ Construct an Album from an Artist.
Parameters: - artist (Artist) – Artist to extract the album from.
- album_name (string) – name of the album.
Returns: Album
-
classmethod
from_url
(url)[source]¶ Construct an Album from an url.
Parameters: url (string) – url.
Returns: Album
- Examples::
>>> album = Album.from_url('https://lyrics.fandom.com/wiki/Daughter:His_Young_Heart_(2011)')
>>> album
-
search_song
(song_name)[source]¶ Search a song from an album’s playlist.
Parameters: song_name (string) – name of the song to look for Returns: Song
lyricsfandom.music.song¶
Extract lyrics and songs from https://lyrics.fandom.com/
website.
-
class
lyricsfandom.music.song.
Song
(artist_name, song_name, album_name=None, album_type=None, album_year=None)[source]¶ Defines a Song from https://lyrics.fandom.com/.
song_name: name of the song.
url: url of the song.
-
classmethod
from_album
(album, song_name)[source]¶ Construct a Song from an Album.
Parameters: - album (Album) – album to extract the song from.
- song_name (string) – name of the song.
Returns: Song
-
classmethod
from_artist
(artist, song_name)[source]¶ Construct a Song from an artist.
Parameters: - artist (Artist) – artist to extract the song from.
- song_name (string) – name of the song.
Returns: Song