Data Preview Portal Questions

What is the best browser to use to explore the DataPreview Portal?

The DataPreview Portal works best with Chrome and Firefox. Mac OS users can also use Safari. Internet Explorer appears to have limited functionality, and is currently not recommended.

How can the variables list be searched or filtered?

From the Variables search, accessed through the DataPreview Portal, full-text searches can be carried out on either the variable Name or Label. The variables can also be filtered by classification through the clickable boxes on the left. A Help button is available on the DataPreview Portal page, next to the main search bar, with detailed instructions. To see all of the questionnaires used by the CLSA, you can visit the Data Collection Tools under the Researchers tab.


What variables are included in each questionnaire?

To obtain all the variables contained in a questionnaire, type the two- or three-letter prefix (e.g. SDC for Socio-demographic Variables) into the full-text search box in the Variables listing, under “Variable properties > Name”. You can also use more general terms such as ‘food’ or ‘work’ (under “Variable properties > Label”) to find variables related to those terms; however, search terms are not exhaustive. For more information on the variables included in a questionnaire, please visit the Data Collection Tools under the Researchers tab.

How are multiple choice questions represented as variables in the CLSA dataset?

Multiple-choice questions are represented by either a single variable or multiple variables, depending on what the question allows:

- A question allowing only one response is represented by a single variable that can take on multiple values. Open-text responses are permitted in many questions; common and distinct responses are recoded to create new categories within the variable itself.

- For a question allowing multiple responses, each possible response category is assigned its own binary variable. Open-text responses are also permitted in many of these questions; common and distinct options are also recoded to create additional variables within the question scope. The number of variables corresponding to that question matches the number of response options.
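As a minimal sketch of the two representations described above, the Python below builds one record for each case. The variable names (XYZ_TRANSPORT, XYZ_DEVICE_*) and the category codes are hypothetical and chosen only for illustration; they are not actual CLSA variables.

```python
# Hypothetical example only: names and codes are illustrative assumptions,
# not actual CLSA variables.

# Single-response question: one variable whose value is the chosen category code.
single_response = {"XYZ_TRANSPORT": 2}  # e.g. 1 = car, 2 = bus, 3 = walking

# Multiple-response question: one binary variable per response option.
options = ["CANE", "WALKER", "WHEELCHAIR"]
selected = {"CANE", "WALKER"}  # options this participant chose
multi_response = {f"XYZ_DEVICE_{opt}": int(opt in selected) for opt in options}

print(multi_response)
# {'XYZ_DEVICE_CANE': 1, 'XYZ_DEVICE_WALKER': 1, 'XYZ_DEVICE_WHEELCHAIR': 0}
```

The multiple-response record has as many variables as there are response options, which mirrors the rule stated in the second bullet.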

Where can I find information about response categories for multiple choice questions?

In the variable view, clicking on the name of each variable reveals more detailed information. For example, under Categories, the variable information page will include the following information:

• Name: the value entered for a response in the questionnaire;

• Label: the response (or response category) corresponding to each value (Name);

• Missing: values corresponding to an unanswered question (don’t know, not applicable, or refused).

Where can I find supplementary information about each of the variables?

Clicking on the variable name in the variable view reveals more detailed information about that variable. (This function is not available for all study variables.) This information includes the question pertinent to the variable, the variable label, a list of the response option categories and some automatically generated summary statistics. In some instances, additional notes on skip patterns or references are included as well.

Data Access Questions

How do I get access to the data?

Requests to access the CLSA data are reviewed by the Data and Sample Access Committee (DSAC). Please consult the Data and Sample Access Policy and Guiding Principles, the pertinent sections of the CLSA protocol(s), the CLSA Data Collection Tools, and the information on the Data Access Application Process and timelines, before preparing an application. The steps involved in the data access process are also outlined here.

When are the data access application deadlines?

There are three data access application deadlines per year. For upcoming dates, please see the Application Deadlines page of our website under the Data Access section.

Do I need an institutional email address to access CLSA data?

Yes, approved users requiring access to CLSA data must use the email address of the institution with which they are affiliated. Data will only be released to institutional email addresses. Email addresses containing domain names such as Gmail, Hotmail, etc. are not acceptable.

How are alphanumeric data released to users?

The Data and Sample Access Committee (DSAC) reviews applications for the use of CLSA data and biospecimens and makes a recommendation to the CLSA Scientific Management Team (SMT). Once the project has been approved by the SMT, the CLSA Access Agreement has been signed, and proof of ethics approval has been received by the CLSA, the Statistical Analysis Centre will provide a download link for the dataset to the primary applicant. The link is valid for seven days and the number of downloads is determined by the number of project team members who have signed Schedule F of the CLSA Access Agreement, indicating that they require direct access to the data. It is the primary applicant’s responsibility to share the download link with the project team members who have signed Schedule F, and to ensure that all of the project team members respect the terms of the signed CLSA Access Agreement. Please refer to the CLSA Access Agreement for more information on the responsibilities of users.

Can I share the data?

Strict CLSA security and confidentiality rules are in place governing the use and sharing of CLSA data. The CLSA requires users to sign a CLSA Access Agreement that details the specific uses of data and the CLSA’s expectations with regard to privacy and confidentiality. Only the primary applicant and the project team members who have signed Schedule F of the CLSA Access Agreement or an approved Amendment are allowed to have direct access to the raw data. No approved user or member of their research team is allowed to share the CLSA dataset, in whole or in part, with individuals who have not signed Schedule F of the CLSA Access Agreement or an approved Amendment.

Can I add an investigator or a student to the project team, so that they may have access to the data?

Yes, you can add personnel to your study while your CLSA Access Agreement is valid. Please request an Amendment Form by sending an email to access@clsa-elcv.ca, noting ‘Amendment Form Request’ in the subject line of your email. Once completed, please return the form to access@clsa-elcv.ca for review. Only once your amendment has been approved, and the CLSA Access Agreement has been amended (if required), can you allow the new person(s) access to the dataset and/or biospecimens.

What is the format of the alphanumeric dataset when released?

Data are provided to researchers in a comma-separated values (.csv) file. Please note that the Baseline CLSA alphanumeric dataset contains over 4000 variables collected from more than 51,000 participants. Depending on your choice of statistical software and proposed analyses, automatic data imports may not succeed and you may need to instruct your software how to read the file. This may require the use of advanced scripting and/or macros in some cases. The CLSA encourages you to include someone experienced in working with such complex datasets on your project team.
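As an illustration of instructing software how to read the file, here is a minimal Python sketch that declares a converter for each column instead of relying on automatic type detection. The column names and values are hypothetical stand-ins, and an in-memory sample replaces the released file; this is one possible approach under those assumptions, not a CLSA-recommended procedure.

```python
import csv
import io

# Stand-in for the released .csv file; column names and values are hypothetical.
# With a real file, use: open(path, newline="", encoding="utf-8")
sample = io.StringIO(
    "entity_id,AGE_EXAMPLE,SCORE_EXAMPLE\n"
    "0001,63,12.5\n"
    "0002,71,\n"
)

# Explicit per-column converters, so nothing is left to automatic type guessing.
# Identifiers are kept as text so leading zeros are preserved.
converters = {"entity_id": str, "AGE_EXAMPLE": int, "SCORE_EXAMPLE": float}

def parse_row(row):
    # Keep blank cells as None instead of forcing a numeric conversion.
    return {k: (converters[k](v) if v != "" else None) for k, v in row.items()}

rows = [parse_row(r) for r in csv.DictReader(sample)]
print(rows[0]["entity_id"], rows[1]["SCORE_EXAMPLE"])
```

Reading row by row with declared types keeps memory use predictable on a file with thousands of columns, at the cost of writing the column specification by hand.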

How long do I have to use and analyse the data once I receive them?

Researchers who have received data will have a specified time within which the proposed analyses must be completed. This timeframe is defined in the CLSA Access Agreement. If the analyses are not completed in this period, the applicant must either submit a request for a time extension or their data access agreement may be terminated (CLSA Access Agreement, Section 13). The CLSA will monitor the approved applications for adherence to the timeline. To make a request for a timeline extension, please request an Amendment Form by sending an email to access@clsa-elcv.ca, noting ‘Amendment Form Request’ in the subject line of your email.

How can I request biospecimens?

The anticipated release date for biospecimens is 2019. There will be one application deadline to submit biospecimen access requests per year. The application deadline is yet to be determined and will be posted on our website. For further information, please consult the Biospecimen Access Guidelines. For questions related to biospecimen access, please contact the Biorepository and Bioanalysis Centre (BBC) at bbc@clsa-elcv.ca.

Can data access be expedited?

Data access cannot be expedited; all interested researchers are required to follow the same application procedures to gain access to the CLSA dataset. Please see the Data Access Application Process page and the Application Deadlines section of our website for further information.

Where can I find information on projects using the CLSA dataset?

Approved Project Summaries are posted under the Researchers Tab of our website.

Where can I find publications about and using the CLSA dataset?

Publications about the CLSA, as well as those using CLSA data, can be found under Publications, in the Stay Informed section of our website.

Publications & Presentation Questions

Does the CLSA need to review my publications and presentations?

Final drafts of all manuscripts describing research using CLSA data and/or biospecimens must be sent to the CLSA for review at least 15 working days prior to anticipated submission to the journal. Abstracts, posters and presentations do not need to be submitted for review, but should include appropriate acknowledgements. Please review our Publication and Promotion Policy for CLSA Data Users for additional information.

Can I publish multiple peer-reviewed manuscripts based on my approved CLSA project?

As a publicly funded research platform, the CLSA encourages the dissemination of research findings from approved projects. The CLSA expects users to publish their findings in peer-reviewed journals. Multiple publications may be prepared based on a single approved project as long as the publications are directly linked to the objectives of the approved project.

Where do I find CLSA’s data availability statement?

Please consult Section 2.5 of our Publication and Promotion Policy for CLSA Data Users for our data availability statement.

Dataset Questions

How many participants are part of the CLSA at Baseline?

At Baseline, 21,241 participants were enrolled in the Tracking cohort and 30,097 participants in the Comprehensive cohort for a total of 51,338 CLSA participants.

The 30-minute Maintaining Contact Questionnaire (MCQ) interviews with additional health-related questions were completed approximately 18 months after the initial Baseline data collection. In total, 19,052 Tracking participants and 28,789 Comprehensive participants completed the MCQ. The indicator variable ADM_COMPLETE_MCQ is included in the dataset to indicate those participants who completed the MCQ.
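The counts above can be checked with simple arithmetic, and the indicator variable can be used to restrict an analysis to MCQ completers. The sketch below is illustrative only: the records are made up, and the coding of ADM_COMPLETE_MCQ as 1 for completed is an assumption that should be confirmed against the data dictionary.

```python
# Cohort sizes quoted above.
tracking, comprehensive = 21_241, 30_097
baseline_total = tracking + comprehensive
assert baseline_total == 51_338

# MCQ completers quoted above, summed across both cohorts.
mcq_total = 19_052 + 28_789

# Hypothetical records; the coding 1 = completed is an assumption,
# so check the data dictionary for the actual category values.
records = [
    {"id": "A", "ADM_COMPLETE_MCQ": 1},
    {"id": "B", "ADM_COMPLETE_MCQ": 0},
    {"id": "C", "ADM_COMPLETE_MCQ": 1},
]
mcq_completers = [r["id"] for r in records if r["ADM_COMPLETE_MCQ"] == 1]

print(baseline_total, mcq_total, mcq_completers)
```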

What were the Baseline exclusion criteria?

Please refer to Section 5.3 of the CLSA Protocol, available under the Researchers section of our website.

When were Baseline data collected?

Periods of data collection for the Baseline assessments were as follows:

Baseline Tracking: 2011-09 to 2014-05

Baseline Comprehensive: 2011-12 to 2015-07

Maintaining Contact Questionnaire (MCQ) Tracking: 2013-09 to 2016-02

Maintaining Contact Questionnaire (MCQ) Comprehensive: 2014-05 to 2016-01

When were Follow-Up 1 data collected?

Periods of data collection for Follow-Up 1 assessments were as follows:

Follow-up 1 Tracking (telephone): 2014-05 to 2018-12

Follow-up 1 Comprehensive (in home and data collection site): 2015-07 to 2018-12

Does the CLSA provide guidance on how to analyse my data?

No, it is not within the purview of the CLSA to advise approved users on statistical analyses for approved projects. Data Support Documentation is available under the Researchers tab of our website, including a detailed document on the use of Sampling Weights. For further help, please consult with a statistician.

Are bootstrap weights available for the analyses of CLSA data?

No, the CLSA does not have bootstrap weights for the dataset, and we are not planning to produce bootstrap weights in the near future.

Will the CLSA dataset be linked to provincial health administrative databases across Canada? When will these data be available?

CLSA is working centrally on strategies to link individual level CLSA data with data from health administrative databases across Canada. Please continue to monitor the website for updates.

Can I link the CLSA dataset to other third party data holdings that I have access to?

Linking of the CLSA data to third party data holdings by an approved user is prohibited. Any proposals for linkage must be approved by the CLSA Scientific Management Team, and executed internally by the CLSA. Six-digit postal codes or HIN data are never released to users.

Are CLSA data available in Research Data Centres (RDC)?

No, currently, CLSA data are only available through a direct application to the CLSA. For more information on how to apply, please consult the Data Access Application Process section of our website.

What do blank values for a variable represent?

In general, variables in the CLSA dataset reflect the interview process. In some cases, follow-up questions were only asked if specific answers were given to preceding questions. Blank values in the Baseline data represent valid skip patterns. For example, the number of daughters and the number of sons are only asked if the participant answered that they have at least one child. In the CLSA dataset, participants with no children will have blank values for both.
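When reading the data, this means a blank cell should be kept distinct from a zero. A minimal Python sketch follows, using hypothetical column names (HAS_CHILDREN, NUM_DAUGHTERS, NUM_SONS) and made-up rows rather than actual CLSA variables.

```python
import csv
import io

# Hypothetical extract: participant B has no children, so the follow-up
# questions were skipped and the corresponding cells are blank.
sample = io.StringIO(
    "id,HAS_CHILDREN,NUM_DAUGHTERS,NUM_SONS\n"
    "A,1,2,0\n"
    "B,0,,\n"
)

def to_count(cell):
    # A blank cell means "question not asked" (valid skip), not zero.
    return int(cell) if cell != "" else None

children = {
    row["id"]: (to_count(row["NUM_DAUGHTERS"]), to_count(row["NUM_SONS"]))
    for row in csv.DictReader(sample)
}
print(children)
```

Participant A's zero sons is a recorded answer, while participant B's values stay as None because the questions were never asked.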

What are derived variables?

Within the CLSA dataset, derived variables (DVs) are variables that are created from other variables. DVs are derived by re-grouping or re-classifying the original variables, to glean information otherwise not available. Some DVs are based on published measures or scales. You will find documentation related to DVs on our Data Support Documentation page, under the Researchers tab of our website.

How is participant death captured in the CLSA?

Participant death is currently captured in three ways: 1) from the next of kin contacting the CLSA directly, 2) through contact with the participants between main waves of data collection, or 3) from linkage to provincial vital statistics. Mortality data are not yet available.

What if there appears to be an error or omission in the data that I receive?

The CLSA takes great care to check the accuracy and completeness of the data prior to release. However, because of the size of the dataset and the large number of variables, we cannot guarantee the accuracy, completeness, or fitness for any particular purpose of the data. It is the responsibility of each data user to verify their dataset, the accompanying data dictionaries and the Data Support Documentation available on our website. If you think your data are incomplete or if you identify errors while conducting your analyses, please contact us at access@clsa-elcv.ca.

Occasionally, there may be a change in the data after you have already received your dataset. If this occurs, we will send a Data Release Update to all approved users, explaining the change(s). You will be able to request the updated dataset if relevant to your study.

Application Questions

How long will it take to receive my dataset?

Once you submit your CLSA Data and Biospecimen Request Application, you can track the progress of your application through the online application system. You may be contacted if additional information is required. You will be notified about the approval status of your application approximately three months after the submission deadline. If your application is approved, a CLSA Access Agreement must be negotiated and signed between McMaster University and your institution. This part of the process can take a variable length of time (up to an additional three months) and is not under the control of the CLSA. You will also need to provide evidence of ethics approval for your project, if you have not already done so in your initial application. Please be aware that these steps may affect the length of time that it takes for the data to be released to you. Once all parties have signed the CLSA Access Agreement and proof of ethics approval has been received by the CLSA, your data will be released within 7 to 10 working days. When planning your project, you must allow at least six (6) months from the application submission deadline to the time you receive your dataset.

Can my application be rejected?

The goal of the CLSA is to enable data access to the platform. There may be some instances when an applicant is asked to revise and resubmit an application at the recommendation of the Data and Sample Access Committee (DSAC). To avoid applications being sent to the DSAC that are not appropriate, we try to work with interested researchers before the application is submitted, to provide them with information about the available data and feasibility of a project. During the application process itself, we ask applicants to correct errors and omissions and provide feedback from the DSAC review, so that applicants can clarify or revise and resubmit a proposal.

What do I do if I have problems completing the online application?

Should you encounter any issues completing the online application, please contact us via access@clsa-elcv.ca.

Who can apply to use CLSA data?

The CLSA data are currently available to approved public sector researchers, with no preferential or exclusive access for any individual. The CLSA welcomes applications from graduate students and postdoctoral fellows who wish to use data for their thesis research or for their postdoctoral work, respectively. For trainee applications (MSc, PhD, Postdoctoral fellows), the primary applicant must be the supervisor and the trainee must be clearly identified.

As an international researcher, can I apply to access CLSA data?

Yes, investigators affiliated with public research organizations outside Canada can apply to access alphanumeric data collected as part of the CLSA.

Currently, there is no provision to transfer biospecimens to applicants outside of Canada. However, international researchers may choose to collaborate with Canadian researchers to access biospecimens, as long as the biospecimens are analysed in Canada.

How can I determine if my research project is feasible?

The CLSA DataPreview Portal has been designed to help researchers browse available variables in the CLSA Baseline dataset and find basic frequencies. Should you need additional information not available through the Portal to determine the feasibility of your proposal, please contact us at access@clsa-elcv.ca. We can provide simple cross-tabulations (of two or three variables), which are not available through our DataPreview Portal. We do our utmost to respond to data queries to help potential users ensure that their proposal is feasible; however, we are not resourced to provide more in-depth statistical support.

Will my proposal undergo a scientific review?

Evidence of peer-reviewed funding will be considered evidence of scientific review for data access applications. You must provide proof of the peer-reviewed funding for the specific project in your application. If there are no plans to submit an application for financial support for your project, or if it is a trainee project, please provide evidence of peer review (e.g. internal departmental review or thesis protocol defense) if available. Awards of funding not specific to the proposed project (e.g. student fellowships) are not considered proof of peer review. If no evidence of scientific peer review is provided with the application, the project will undergo scientific review by the Data and Sample Access Committee.

Does my project need to have secured funding before I apply?

No, you do not need to secure funding before applying to request CLSA data. If funding has been requested, but not yet approved, please provide the name of the funding agency in the appropriate section of the CLSA Data and Biospecimen Request Application. If your project is approved, and once the CLSA Access Agreement has been signed, the CLSA Financial Administrator will contact the primary applicant with the invoice for payment of the fees.

Do I have to obtain ethics approval for my project?

Yes. Please note that ethics approval is not required at the time of the application to use CLSA data, but no data or biospecimens will be released until proof of ethics approval has been received by the CLSA. Should your institution not require a full ethical review for the use of de-identified data, please provide a letter from your institutional review board to this effect. Ethics approval must be obtained only from the primary applicant’s institution, not from all of the institutions of the members of the project team.

What are the fees for access to CLSA data?

Currently the charge for partial cost recovery per application is CAD $3,000 for researchers based in Canada, and CAD $5,000 for researchers based at institutions outside of Canada. This includes access to the Baseline, Follow-up 1 or both datasets. Due to the additional work required to make some data research ready, additional fees apply for access to image files, raw data and datasets that require more complex customization. Please see the table of additional fees on the Fees page of our website. The fees are payable for the retrieval and preparation of the dataset per approved project, not for each project team member.


Do trainees have to pay for access to CLSA data?

Graduate students (MSc or PhD) and postdoctoral fellows (limit of one waiver per postdoctoral fellow) who are enrolled at Canadian institutions can apply for a fee waiver if they wish to obtain the CLSA data for the sole purpose of their thesis or their postdoctoral project, respectively. Canadian trainees working outside Canada but funded through a Canadian source are also eligible for a fee waiver. The request for a fee waiver must be indicated in Part 1 of the CLSA Data and Biospecimen Access Request Application. Trainees eligible for a fee waiver are also exempt from the supplemental data fee for images and raw data. Applications made through the CIHR Catalyst Grant for the Analysis of CLSA Data are not eligible for trainee fee waivers.

If I have a student or trainee as part of my project team, am I eligible to get a fee waiver for data access?

No, simply having trainees as part of the project team does not satisfy criteria for eligibility for a fee waiver.

When will data from the Follow-up 1 be available?

We anticipate the first release of Follow-up 1 alphanumeric data in late spring 2019. Applications to access data from Follow-up 1 are accepted as of the first data access application deadline in 2019, on February 25.

If I already have an approved CLSA project, do I get access to Follow-up data as well?

When the Follow-up data are released, current users and new applicants will be required to complete a new CLSA Data and Biospecimen Request Application to access Follow-up 1 data. Follow-up 1 data cannot be requested through an Amendment to an existing project.

Can I apply for data that are not yet available, but will be later?

The CLSA only accepts applications for data that are available at the time of submission. To know what data are currently available, please consult the CLSA Data Availability Table which is regularly updated as data become available. Applications proposing the use of data that are not available at the time of submission will not be considered for review, and you will need to reapply once those data become available.