Nettskjemar connects to version 3 of the nettskjema api, and the main functionality here is to download data from a form into R. Once you have created a nettskjema client, and set up your Renvironment locally, you can start accessing your forms.
While functions to download data also have the option to turn off the codebook, i.e. return the data with the original questions as column names, this is not recommended. Working with data in R in this format is very unpredictable, and we cannot guarantee that the functions in this package will act as expected.
Therefore, you are highly advised if you are using this package, to turn on the codebook in the Nettskjema portal for your form, and setting up a codebook for the entire form.
You can toggle the codebook for a form by going to the Nettskjema portal and entering your form. Then proceed to “Settings” and then “General settings”, and make sure “Codebook activated” is set to “Yes”. Once this is toggled, you will need to setup the codebook, either manually (advised) or by using the pre-filling functionality in Nettskjema.
You can read more about the details around the codebook on the UiO webpages (only available in Norwegian).
The data returns in this package are developed to be tidyverse-compatible. This means that those who are familiar with tidyverse, should find working with the data as retrieved from this package fairly easy. If you want to learn about the tidyverse and how to use it, there are excellent resources for that on the Tidyverse webpage.
Perhaps at the core of this package is the ability to download submission answers to a form into a tibble (variation of a data.frame).
library(nettskjemar)
formid <- 123823
data <- ns_get_data(formid)
data
#> formid $submission_id
#> 1 123823 27685292
#> $created freetext radio
#> 1 2023-06-01T20:57:15+02:00 some text 1
#> checkbox.questionnaires checkbox.events
#> 1 1 1
#> checkbox.logs dropdown radio_matrix.grants
#> 1 0 4 1
#> radio_matrix.lecture radio_matrix.email
#> 1 2 2
#> checkbox_matrix.1.IT
#> 1 1
#> checkbox_matrix.1.colleague
#> 1 1
#> checkbox_matrix.1.admin
#> 1 0
#> checkbox_matrix.1.union
#> 1 0
#> checkbox_matrix.1.internet
#> 1 0
#> checkbox_matrix.2.IT
#> 1 0
#> checkbox_matrix.2.colleague
#> 1 0
#> checkbox_matrix.2.admin
#> 1 1
#> checkbox_matrix.2.union
#> 1 0
#> checkbox_matrix.2.internet date time
#> 1 0 2023-06-01 12:00
#> datetime number_decimal
#> 1 2023-06-12T13:33 4.5
#> number_integer slider attachment_1
#> 1 77 3 sølvi.png
#> attachment_2 $answer_time_ms
#> 1 74630
#> [ reached 'max' / getOption("max.print") -- omitted 2 rows ]
You will notice three columns that are prefixed with $
.
These are columns automatically added by the Nettskjema backend to your
data, and the prefix is used to denote exactly this.
If you’d like to have labelled data, similar to that of SPSS and Stata, have a look at the vignette on labelled data for more information about that.
Raw answers come in a very different format than the data as seen in the codebook and preview in Nettskjema portal. The raw data have a timestamped format, per question for each submission. This means the data comes in a tall format (many rows per submission) rather than a wide format (one row per submission). The raw data may provide those who have keen interest in the time a users spent between each question. The data.frame this output will have one column that indicates which question the row is for, and another which is the response to that question.
# Fetch raw data
ns_get_data(formid, type = "long")
#> formid formId submissionId answerId
#> 1 123823 123823 27685292 158263133
#> 2 123823 123823 27685292 158263124
#> elementId externalElementId textAnswer
#> 1 1641697 freetext some text
#> 2 1641698 radio <NA>
#> answerOptionIds externalAnswerOptionIds
#> 1
#> 2 3879435 1
#> elementType createdDate
#> 1 QUESTION 2023-06-01T20:57:15
#> 2 RADIO 2023-06-01T20:57:15
#> modifiedDate subElementId
#> 1 2023-06-01T20:57:15 NA
#> 2 2023-06-01T20:57:15 NA
#> answerAttachmentId filename mediaType size
#> 1 NA <NA> <NA> NA
#> 2 NA <NA> <NA> NA
#> attachment.answerAttachmentId
#> 1 NA
#> 2 NA
#> attachment.filename attachment.mediaType
#> 1 <NA> <NA>
#> 2 <NA> <NA>
#> attachment.size
#> 1 NA
#> 2 NA
#> [ reached 'max' / getOption("max.print") -- omitted 44 rows ]
Raw data cannot be labelled, and no other alterations have been made to the data. They come exactly as the API returns them, so you may do what you need with them.
The Nettskjema survey tool includes the possibility to create a matrix of checkboxes, i.e. giving the respondents the ability to select several options within a question. Checkboxes are returned as binary columns, one columns per checkbox where 1 indicates that the checkbox was ticked.
We provide some extra functionality to help work with these data. Currently we only support working with the checkbox matrix questions, but are working on a solution for all checkboxes.
ns_get_data(formid) |>
ns_alter_checkbox(
to = "list"
)
#> $submission_id formid
#> 1 27685292 123823
#> 2 27685302 123823
#> $created freetext
#> 1 2023-06-01T20:57:15+02:00 some text
#> 2 2023-06-01T20:58:33+02:00 another answer
#> radio checkbox.questionnaires checkbox.events
#> 1 1 1 1
#> 2 -1 0 0
#> checkbox.logs dropdown radio_matrix.grants
#> 1 0 4 1
#> 2 1 9 3
#> radio_matrix.lecture radio_matrix.email
#> 1 2 2
#> 2 3 1
#> date time datetime
#> 1 2023-06-01 12:00 2023-06-12T13:33
#> 2 2023-02-07 14:45 2024-02-15T08:55
#> number_decimal number_integer slider
#> 1 4.5 77 3
#> 2 2.2 45 1
#> attachment_1 attachment_2 $answer_time_ms
#> 1 sølvi.png 74630
#> 2 marius.jpeg 71313
#> checkbox_matrix.1 checkbox_matrix.2
#> 1 colleague, IT admin
#> 2 internet admin, union
#> [ reached 'max' / getOption("max.print") -- omitted 1 rows ]
These can be separated into rows if wanted, using tidyverse syntax.
library(tidyr)
library(dplyr)
# As list column
ns_get_data(formid) |>
ns_alter_checkbox(
to = "list"
) |>
relocate(checkbox_matrix.1, .after = 2)
#> $submission_id formid checkbox_matrix.1
#> 1 27685292 123823 colleague, IT
#> 2 27685302 123823 internet
#> $created freetext
#> 1 2023-06-01T20:57:15+02:00 some text
#> 2 2023-06-01T20:58:33+02:00 another answer
#> radio checkbox.questionnaires checkbox.events
#> 1 1 1 1
#> 2 -1 0 0
#> checkbox.logs dropdown radio_matrix.grants
#> 1 0 4 1
#> 2 1 9 3
#> radio_matrix.lecture radio_matrix.email
#> 1 2 2
#> 2 3 1
#> date time datetime
#> 1 2023-06-01 12:00 2023-06-12T13:33
#> 2 2023-02-07 14:45 2024-02-15T08:55
#> number_decimal number_integer slider
#> 1 4.5 77 3
#> 2 2.2 45 1
#> attachment_1 attachment_2 $answer_time_ms
#> 1 sølvi.png 74630
#> 2 marius.jpeg 71313
#> checkbox_matrix.2
#> 1 admin
#> 2 admin, union
#> [ reached 'max' / getOption("max.print") -- omitted 1 rows ]
# Turns list column, into rows
ns_get_data(formid) |>
ns_alter_checkbox(
to = "list"
) |>
unnest(checkbox_matrix.1) |>
relocate(checkbox_matrix.1, .after = 2)
#> # A tibble: 5 × 23
#> `$submission_id` formid checkbox_matrix.1
#> <int> <dbl> <chr>
#> 1 27685292 123823 colleague
#> 2 27685292 123823 IT
#> 3 27685302 123823 internet
#> 4 27685319 123823 colleague
#> 5 27685319 123823 internet
#> # ℹ 20 more variables: `$created` <chr>,
#> # freetext <chr>, radio <int>,
#> # checkbox.questionnaires <int>,
#> # checkbox.events <int>, checkbox.logs <int>,
#> # dropdown <int>, radio_matrix.grants <int>,
#> # radio_matrix.lecture <int>,
#> # radio_matrix.email <int>, date <chr>, …
If you are not used to working with list columns, another option is to work with delimited strings in columns. In this procedure, you can turn the checkboxes into a character column, where each option in separated by a character of your choosing.
# As delimited string column
ns_get_data(formid) |>
ns_alter_checkbox(
to = "character",
sep = ";"
) |>
relocate(checkbox_matrix.1, .after = 2)
#> $submission_id formid checkbox_matrix.1
#> 1 27685292 123823 colleague;IT
#> 2 27685302 123823 internet
#> $created freetext
#> 1 2023-06-01T20:57:15+02:00 some text
#> 2 2023-06-01T20:58:33+02:00 another answer
#> radio checkbox.questionnaires checkbox.events
#> 1 1 1 1
#> 2 -1 0 0
#> checkbox.logs dropdown radio_matrix.grants
#> 1 0 4 1
#> 2 1 9 3
#> radio_matrix.lecture radio_matrix.email
#> 1 2 2
#> 2 3 1
#> date time datetime
#> 1 2023-06-01 12:00 2023-06-12T13:33
#> 2 2023-02-07 14:45 2024-02-15T08:55
#> number_decimal number_integer slider
#> 1 4.5 77 3
#> 2 2.2 45 1
#> attachment_1 attachment_2 $answer_time_ms
#> 1 sølvi.png 74630
#> 2 marius.jpeg 71313
#> checkbox_matrix.2
#> 1 admin
#> 2 admin;union
#> [ reached 'max' / getOption("max.print") -- omitted 1 rows ]
# Turns string column, into rows
ns_get_data(formid) |>
ns_alter_checkbox(
to = "character",
sep = ";"
) |>
relocate(checkbox_matrix.1, .after = 2) |>
separate_rows(checkbox_matrix.1)
#> # A tibble: 5 × 23
#> `$submission_id` formid checkbox_matrix.1
#> <int> <dbl> <chr>
#> 1 27685292 123823 colleague
#> 2 27685292 123823 IT
#> 3 27685302 123823 internet
#> 4 27685319 123823 colleague
#> 5 27685319 123823 internet
#> # ℹ 20 more variables: `$created` <chr>,
#> # freetext <chr>, radio <int>,
#> # checkbox.questionnaires <int>,
#> # checkbox.events <int>, checkbox.logs <int>,
#> # dropdown <int>, radio_matrix.grants <int>,
#> # radio_matrix.lecture <int>,
#> # radio_matrix.email <int>, date <chr>, …