Introducing czapi
What
czapi is a basic Python library used to scrape curling boxscores off of the popular CurlingZone website.
Why
For the past couple years, I’ve tried, and failed, to write a nice tool for scraping boxscores off of CurlingZone which can be used to develop homegrown analytics and this is the latest iteration.
How
czapi can be installed using the extremely popular and common packgage manager, pip.
pip install czapi
After installation, the api can be imported into any Python environment.
import czapi.api as api
At the moment czapi supports scraping curling boxscores off of CurlingZone in two ways:
- Using the CurlingZone game id.
- Using the CurlingZone event id, draw id and game number.
Option 1
By navigating to any linescore page, like the one seen below and clicking on the ‘View Game Details’ link:
You’ll be brought to a game details page where in the URL, you’ll notice a game id (298613 in this case).
That game id can be used to scrape the boxscore details.
api.get_full_boxscore(cz_game_id = 298613)
{
'Raphaela Keiser':
{
'href' : 'event.php?view=Team&eventid=6983&teamid=159142&profileid=30327#1'
,'hammer': False
,'score': ['0', '1', '0', '0', '1', '0', '1', '1', '0', '1']
,'finalscore': '5'
,'date': 'Feb 19 - 22, 2022'
,'event': 'Swiss Curling Championships'
,'hash': 'c4f6d572fb32f960faa0e270e20760f56646ba91e6f38413e015707837c8c396'}
,'Silvana Tirinzoni':
{
'href': 'event.php?view=Team&eventid=6983&teamid=159140&profileid=30815#1'
,'hammer': True
,'score': ['0', '0', '1', '0', '0', '1', '0', '0', '1', '0']
,'finalscore': '3'
,'date': 'Feb 19 - 22, 2022'
,'event': 'Swiss Curling Championships'
,'hash': 'c4f6d572fb32f960faa0e270e20760f56646ba91e6f38413e015707837c8c396'}
}
Option 2
Instead of navigating to a game details page, the boxscore can be scraped right from the linescore page using the CurlingZone event id, draw id and game number. The image below highlights the CurlingZone event and draw id (6983 and 20 respectively in this case).
The game number is the sequential order the games appear on the page, so, in this case, the Keiser / Tirinzoni game would be game number 1 and the Huerlimann / Schwaller gamne would be game number 2.
Here’s an example if i wanted to grab the Huerlimann / Schwaller game.
api.get_full_boxscore(cz_event_id = 6983, cz_draw_id = 20, game_number = 2)
{
'Corrie Huerlimann':
{
'href': 'event.php?view=Team&eventid=6983&teamid=159139&profileid=30260#1'
,'hammer': True
,'score': ['0', '0', '0', '1', '1', '0', '0', '0', '0', 'X']
,'finalscore': '2'
,'date': 'Feb 19 - 22, 2022'
,'event': 'Swiss Curling Championships'
,'hash': 'ab12b14f1a78484f1afb6106ae681b645deab0dda3eb7a90f02e4b5702f655e9'}
,'Xenia Schwaller':
{
'href': 'event.php?view=Team&eventid=6983&teamid=159137&profileid=30815#1'
,'hammer': False
,'score': ['0', '2', '1', '0', '0', '1', '1', '1', '2', 'X']
,'finalscore': '8'
,'date': 'Feb 19 - 22, 2022'
,'event': 'Swiss Curling Championships'
,'hash': 'ab12b14f1a78484f1afb6106ae681b645deab0dda3eb7a90f02e4b5702f655e9'}
}
A boxscore is returned as a dictionary of team names, game information dictionary key, value pairs.
Each game information dictionary contains:
- ‘href’ key with a corresponding string value of CurlingZone IDs identifying the team.
- ‘hammer’ key with corresponding boolean value of whether or not that team started the game with hammer.
- ‘score’ key with corresponding list of string value of end-by-end results for that team.
- ‘date’ key with corresponding string value of the date of the event.
- ‘event’ key with corresponding string value of the name of the event.
- ‘hash’ key with corresponding SHA256 hash of the entire boxscore.
Where
What’s next
If you’re interested in helping out in the development of the API, please reach out at robert.art.currie@gmail.com.