https://docs.google.com/spreadsheets/d/1c6A356Eg9zDX3dkYGGvDJuzfpmSgHnp-Zf7ucSodYRA/edit?usp=sharing


...

Try the following analyses / applications with the dataset. Are they possible?

Which steps does each one require?

Which compromises does the analysis involve?


  1. Average contract length (from commissioning to end of contract)
  2. Total electricity production by canton
  3. Total sum of payments received by men on the one hand and by women on the other
  4. Average length of street names by canton
  5. Map with the locations of the KEV recipients (= one point symbol for each exact location, similar to the one in this article)
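
Analyses 2 and 4 in the list above both come down to grouping rows by canton and aggregating one value per group. The following is a minimal sketch of that idea, not a finished solution: the column names `canton`, `street` and `production_kwh` and the three sample rows are hypothetical, since the actual headers of the spreadsheet are not given here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real spreadsheet's column names and
# values will differ and need to be adapted.
rows = [
    {"canton": "ZH", "street": "Bahnhofstrasse", "production_kwh": 1200.0},
    {"canton": "ZH", "street": "Seeweg", "production_kwh": 800.0},
    {"canton": "BE", "street": "Dorfplatz", "production_kwh": 500.0},
]


def total_production_by_canton(rows):
    """Sum the production column for each canton."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["canton"]] += row["production_kwh"]
    return dict(totals)


def mean_street_name_length_by_canton(rows):
    """Average the number of characters in the street name per canton."""
    lengths = defaultdict(list)
    for row in rows:
        lengths[row["canton"]].append(len(row["street"]))
    return {canton: mean(values) for canton, values in lengths.items()}


print(total_production_by_canton(rows))
print(mean_street_name_length_by_canton(rows))
```

With real data, the rows would more likely come from `csv.DictReader` on a CSV export of the spreadsheet. House numbers inside the street field, if present, would have to be stripped first, which is one of the compromises the questions above ask about.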

...

Exercise solutions day 1

  1. Manillio:
    1. Browse https://developer.spotify.com/console/get-search-item/ to get his ID: 7uxtLjuqkJ3cnjQQuW6Cul
    2. Browse https://developer.spotify.com/console/get-artist-top-tracks/, fill in values and get JSON data
    3. Copy and paste into https://json-csv.com/
    4. Download as Excel - do the math (36.60705 min)
    5. Take-Aways:
      1. Sometimes, data is accessible via an API
      2. The preferred data format of APIs is JSON
      3. JSON can be converted into CSV
      4. The preferred way of talking to an API is with code
  2. Wasserstation Tiefenbrunnen
    1. First approach: Scraping data
      1. Browse https://www.tecson-data.ch/zurich/tiefenbrunnen/index.php (as probably shown on Google)
      2. Select “windchill”, 2.11.2018/7.11.2018 and “all values” at the very bottom
      3. Copy the values into Excel by hand and calculate the median
    2. Second approach: Open Data Zürich / API:
      1. Browse https://tecdottir.herokuapp.com/docs/#/measurements
      2. Enter parameters
      3. Copy the curl string, run it in the terminal and redirect the output into a file
      4. Upload the JSON or paste it into https://json-csv.com/ (bonus: use matrix style)
      5. Download CSV, open in Excel and calculate median (don’t forget to filter unneeded dates)
    3. Take-Aways:
      1. Copying and pasting stuff from HTML tables should be avoided
      2. Always look out for an API
      3. Try out different settings of your tools - they might bring you better results (“matrix style”)
      4. Get to know the terminal
      5. Excel / LibreOffice / OpenOffice have some good filters: get to know how to use them
      6. If you run out of queries, delete cookies
  3. Schlichtungsverfahren (conciliation proceedings)
    1. Google it and go to https://www.bwo.admin.ch/bwo/de/home/mietrecht/schlichtungsbehoerden/statistik-der-schlichtungsverfahren.html
    2. Download first PDF
    3. Download Tabula and launch, upload PDF (or use Adobe Reader DC)
    4. Select last table, lattice extraction format
    5. Download as CSV
    6. Open in LibreOffice and make chart
    7. Take-Aways:
      1. Many interesting data are buried in PDFs
      2. Use proprietary software or Tabula to extract the data
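
The Manillio take-aways above ("JSON can be converted into CSV", "the preferred way of talking to an API is with code") can be sketched in a few lines of Python. This is an illustration under assumptions, not the solution used in the course: `SAMPLE` is made-up data that only mimics the shape of a top-tracks response (a `tracks` list whose items carry `name` and `duration_ms`), and the helper names `tracks_to_csv` and `total_minutes` are hypothetical. No request is sent to Spotify here; the JSON is assumed to have been saved already.

```python
import csv
import io
import json

# Made-up sample data in the assumed shape of a top-tracks response.
# The real response contains many more fields per track.
SAMPLE = json.dumps({
    "tracks": [
        {"name": "Track A", "duration_ms": 180000},
        {"name": "Track B", "duration_ms": 240000},
    ]
})


def tracks_to_csv(payload: str) -> str:
    """Flatten the track list into CSV text with two columns."""
    tracks = json.loads(payload)["tracks"]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "duration_ms"], lineterminator="\n"
    )
    writer.writeheader()
    for track in tracks:
        writer.writerow(
            {"name": track["name"], "duration_ms": track["duration_ms"]}
        )
    return buffer.getvalue()


def total_minutes(payload: str) -> float:
    """Add up all track durations and convert milliseconds to minutes."""
    tracks = json.loads(payload)["tracks"]
    return sum(track["duration_ms"] for track in tracks) / 60000


print(tracks_to_csv(SAMPLE))
print(total_minutes(SAMPLE))
```

The same pattern would apply to the JSON file saved from the curl call in the Tiefenbrunnen exercise, with `statistics.median` taking the place of the sum, provided the field names are adapted to that response.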