qmethod: A Package to Explore Human Perspectives Using Q Methodology

Q is a methodology to explore the distinct subjective perspectives that exist within a group. It is used increasingly across disciplines. The methodology is semi-qualitative and the data are analysed using data reduction methods to discern the existing patterns of thought. This package is the first to perform Q analysis in R, and it provides many advantages over the existing software: namely, it is fully cross-platform, the algorithms can be transparently examined, it provides results in a clearly structured and tabulated form ready for further exploration and modelling, it produces a graphical summary of the results, and it generates a more concise report of the distinguishing and consensus statements. This paper introduces the methodology and explains how to use the package, its advantages as well as its limitations. I illustrate the main functions with a dataset on value patterns about democracy.


Introduction
Identifying the different perspectives on or attitudes towards topics of public concern is an important research objective in fields spanning social (e.g., Zografos, 2007), environmental (e.g., Sandbrook et al., 2011) and health sciences (e.g., Thompson et al., 2001). Q is a clearly structured, systematic, and increasingly used methodology designed specifically for these purposes (Watts and Stenner, 2012; Barry and Proops, 1999). It is aimed at exploring the distinct perspectives, discourses, or decision-making styles within a group in order to address practical matters such as the acceptance of new policies and technology or increasing public participation. The method can be used, for example, to identify student learning styles, farmer attitudes towards natural conservation (e.g., Davies and Hodge, 2012; Brodt et al., 2006), user views on technology innovation (Petit dit Dariel et al., 2013), transportation habits (van Exel et al., 2011), citizen identities (Davis, 1999), heterogeneous concepts of love (Watts and Stenner, 2005), or leadership styles in business.
In essence, the data collected in Q methodology (also known as Q technique or Q-sort) consist of a set of items (usually statements) sorted in a specific arrangement. These statements represent all possible opinions, which each respondent sorts in order to express their views (usually from most agree to most disagree). The analytical process reduces the data based on principal components analysis (PCA) or factor analysis (FA). However, instead of correlating variables (as in regular PCA and FA), in Q the respondents are correlated in order to elucidate the relationships between them. The standard data reduction method is followed by a set of analytical steps specific to Q methodology. The final results consist of a small number of sets of sorted statements (typically called the factors), which are different from each other and summarise the perspectives existing among the respondents. These results can be used for further research: to model the relation between perspectives and other variables, to develop a quick test to identify perspectives in larger populations, or to understand the evolution of perspectives over time.
The analysis for Q methodology requires multiple matrix algebra operations which have been described in detail (see Chapter 4 and Appendix in Brown, 1980). The full analysis is implemented in software specific to Q, predominantly PQMethod, which is freely available, written in Fortran, and fully functional in Windows and Mac OS (Schmolck, 2014). Other software includes PCQ (paid-license, Windows only, Stricklin and Almeida, 2004) and Q-Assessor (paid-license, web-based, Epimetrics Group, LLC, 2010). The latter two provide tools for data collection, but the final output and report are virtually the same in all three. This R package improves on the existing Q software in a number of ways. It is fully cross-platform. It allows a completely transparent examination of the equations and the constants used in the computation at each step of the analysis, helping researchers to better understand the process. For the data reduction technique, correlation coefficients other than Pearson are also allowed. The output is concisely structured and tabulated in numerical form rather than in a large fixed-width text file, and it therefore provides a more straightforward and flexible means to study and handle the results. Thus qmethod (Zabala, 2014) results can be easily used for further quantitative modelling and for graphical representation. In addition, the final output in this package provides a clearer and more synthetic report on distinguishing and consensus statements (see below). The package also includes a specific plot() method to build a novel visualisation of the results, as well as import and export functionality.

The Q methodology
Q is a powerful methodology to shed light on complex problems in which human subjectivity is involved. Subjectivity is understood as how people conceive and communicate their point of view about a subject (McKeown and Thomas, 2013). The method originated in a 1935 proposal by Stephenson (Stephenson, 1935), an assistant to Spearman, the developer of FA, to correlate respondents instead of variables in FA. Q was used initially in psychology, then in political science, and, after that, in several other fields. The analytical process is clearly structured and well established (Stephenson, 1953; Brown, 1980), and Q is increasingly being used across disciplines and for different purposes such as policy evaluation, understanding decision-making, or participatory processes.
The following characteristics of the methodology will help in deciding whether it is a suitable approach for a given question. It is versatile due to its compatibility with small samples (see below). It is predominantly exploratory because the patterns of views emerge from the study and thus prevent the researcher from imposing a frame of reference or pre-determined assumptions and definitions (McKeown and Thomas, 2013; Stenner et al., 2008). It is a mixed or semi-qualitative methodology because though the data collected are quantitatively analysed, their interpretation is extensively qualitative (Ramlo, 2011) and makes thorough use of theory. The results can be used in combination with other qualitative methods and as a starting point for quantitative confirmatory methods. For example, Q can be combined with discourse analysis, or it can be used in regression models to examine how perspectives influence behaviour. The basic analytical principle is to correlate the entire responses of individuals. These responses are measured using the same unit, which is often called psychological significance or self-significance, and they indicate the salience (engagement or disengagement) of the statement for the respondent. Both aspects contrast with regular FA, in which variables are correlated and, having different units, may also be incommensurable.

Research design
In its most frequent form, the Q approach consists of selecting a set of statements and asking respondents to sort them over a grid, from most agreement to most disagreement (see Figure 1 for an example of a grid). The statements are a representative sample of the concourse, the whole set of possible expressions on a topic, gathered from all possible points of view (in theory, a concourse would be infinite). The researcher collects a large set of statements from interviews, reviews of literature or mass media, expert consultation, participant observation, etc. This collection is reduced to a final representative selection that usually ranges between 40 and 80 statements (Watts and Stenner, 2012). The statements can express understandings or behavioural preferences relating to the topic. Occasionally, photos, sounds, or other types of stimuli may be used instead of statements.
The sample of respondents does not need to be large or representative of the population, but it must be diverse. The aim is to get the most diverse range of opinions, regardless of whether they are minority ones. The shape of the grid used to sort the statements is up to the researcher. This grid is usually bell-shaped as in Figure 1, assuming that fewer statements generate strong engagement (Brown, 1980). Respondents commonly sort the statements according to their agreement or disagreement, although there are other possible conditions of instruction, that is, different ways in which participants are asked to sort the statements (McKeown and Thomas, 2013), e.g., "Sort the statements according to how person A would respond". A succinct description of the research design can be found in van Exel and de Graaf (2005), Watts and Stenner (2012) offer a detailed reference manual, and a key and extensive work is that of Brown (1980).
The R Journal Vol. 6/2, December 2014 ISSN 2073-4859

Analytical process
The data collected from all respondents are introduced into a matrix with statements as rows and respondents as columns, where each cell value is the score of the grid position in which the respondent placed the statement. For example, in Figure 1 the statements that a respondent most disagreed with would get a score of −4. Sample data available with the package can be loaded by using the command data(lipset). The array of scores for all the statements sorted by a single respondent (the column) is called the Q-sort.
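As a minimal sketch of this layout, the following base R code builds a toy data matrix by hand. The 9-cell grid and the statement and respondent names are invented for illustration and do not come from the lipset data.

```r
# Hypothetical grid with 9 cells: one -2, two -1, three 0, two 1, one 2
grid <- c(-2, -1, -1, 0, 0, 0, 1, 1, 2)

set.seed(1)  # reproducible toy example
# Each respondent places every statement in one cell of the grid,
# so each Q-sort is a permutation of the grid values
qsorts <- replicate(4, sample(grid))
dimnames(qsorts) <- list(paste0("sta_", 1:9), paste0("resp_", 1:4))

qsorts               # statements as rows, Q-sorts as columns
qsorts[, "resp_1"]   # the Q-sort of a single respondent

# With a forced distribution, every column reproduces the grid exactly
forced_ok <- all(apply(qsorts, 2, function(x) all(sort(x) == sort(grid))))
forced_ok
```

A check such as forced_ok is a quick way to spot data-entry errors before running the analysis on data collected with a forced grid.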
The process of analysis has two main parts. In the first, a multivariate data reduction technique is applied, either centroid factor analysis or PCA. This package currently implements only the PCA solution. PCA is readily available in R, and the results from both techniques are similar (McKeown and Thomas, 2013; Watts and Stenner, 2012). The centroid algorithm for factor analysis is an alternative method for FA used almost exclusively in Q methodology and described in Brown (1980). This algorithm differs from standard FA (as implemented in factanal()) and their results, although highly correlated, are not identical.
Initially, a correlation matrix between Q-sorts is built, and the chosen multivariate technique reduces this correlation matrix into components. The components are ordered by the total variability that they explain, and so the first components summarise most of the variability of the initial correlation matrix. Then the first few components are selected and rotated in order to obtain a clearer and simpler structure of the data. The usual criteria by which the number of components is selected include, inter alia, the total amount of variability explained and eigenvalues higher than a certain threshold (both accessible through the call loadings(principal(...)) from psych; Revelle, 2014), as well as a compromise between complexity and interpretability (further details about the possible criteria are given in Watts and Stenner, 2012).
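The steps in this paragraph can be sketched with base R only. This is an illustration under stated assumptions, not the package's code: the toy data are invented, the eigenvalue-above-one rule is just one of the possible selection criteria, and stats::varimax() stands in for the rotation that the package obtains through principal() from psych.

```r
set.seed(42)
grid <- c(-2, -1, -1, 0, 0, 0, 1, 1, 2)
qsorts <- replicate(5, sample(grid))          # 9 statements x 5 Q-sorts
colnames(qsorts) <- paste0("resp_", 1:5)

# Correlate respondents (columns), not statements
cor_mat <- cor(qsorts, method = "pearson")

# Unrotated principal components of the correlation matrix
eig <- eigen(cor_mat, symmetric = TRUE)
eigenvalues <- eig$values                     # returned in decreasing order
explained <- 100 * eigenvalues / sum(eigenvalues)

# Keep components with eigenvalue above one (an assumed selection rule)
n_keep <- sum(eigenvalues > 1)
keep <- seq_len(n_keep)

# Unrotated loadings: eigenvectors scaled by the square root of the eigenvalue
loadings_unrot <- eig$vectors[, keep, drop = FALSE] %*%
  diag(sqrt(eigenvalues[keep]), nrow = n_keep)
rownames(loadings_unrot) <- colnames(qsorts)

# Varimax rotation needs at least two components
if (n_keep >= 2) {
  loadings_rot <- unclass(varimax(loadings_unrot)$loadings)
}

round(explained, 1)
```

The eigenvalues of a correlation matrix add up to the number of Q-sorts, which is why the percentage of explained variance is each eigenvalue divided by that sum.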
The rotation of components in Q studies can be either manual (judgemental) or mathematically optimal (analytical). The rotation results in a matrix of component loadings with Q-sorts as rows and components as columns, indicating the relationship between each Q-sort and component. Mathematical rotation is implemented in the package within the function qmethod(). This function calls internally principal() from psych, which conveniently wraps the rotation modes from GPArotation (Bernaards and Jennrich, 2005) into a single function. Any of the rotations implemented in principal() can be called in qmethod(), and "varimax" is the most commonly used. Manual rotation is not integrated in the current version.
The second part of the analysis is particular to Q. It consists of a) flagging the Q-sorts that will define each component (hereafter called factor, as it usually is in the literature; implemented in the function qflag()), b) calculating the scores of statements for each factor (z-scores and factor scores, implemented in qzscores()), and c) finding the distinguishing and consensus statements (implemented in qdc()).
The most representative Q-sorts for each factor are flagged (a), meaning that only these Q-sorts are used for subsequent calculations. The purpose of flagging is to obtain more distinguishable perspectives, and it may be done either automatically or manually, the latter occurring when the researcher has relevant knowledge about any of the respondents. Automatic flagging is based on two criteria: that the loading l should be significantly high (the significance threshold for a p-value < .05 is given by equation 1, where N is the number of statements; Brown 1980), and that the square loading for a factor j should be higher than the sum of the square loadings for all other factors (equation 2, where f is the total number of factors; Brown 1980).
$$ l > \frac{1.96}{\sqrt{N}} \qquad (1) $$
$$ l_j^2 > \sum_{i=1}^{f} l_i^2 - l_j^2 \qquad (2) $$
Some Q-sorts may be considered confounding because they load highly in more than one factor and thus they are not flagged. Alternatively, manual flagging may be used (see details on how to run manual flagging in Implementation of the analysis in qmethod).
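The two criteria for automatic flagging can be sketched as a small base R function. The function name flag_qsorts() and the loadings are hypothetical, and the use of the absolute loading in the first criterion is an assumption of this sketch; the package's own implementation is qflag().

```r
# loa: matrix of loadings, Q-sorts as rows and factors as columns
# nstat: number of statements (N in the first criterion)
flag_qsorts <- function(loa, nstat) {
  loa <- as.matrix(loa)
  threshold <- 1.96 / sqrt(nstat)     # criterion 1: significant at p < .05
  sq <- loa^2
  others <- rowSums(sq) - sq          # squared loadings on all other factors
  (abs(loa) > threshold) & (sq > others)   # criterion 2: sq exceeds the rest
}

# Hypothetical loadings for four Q-sorts on three factors
loa <- matrix(c(0.80, 0.10, 0.05,
                0.15, 0.70, 0.10,
                0.50, 0.45, 0.40,    # confounded: similar on all factors
                0.20, 0.10, 0.30),   # below the significance threshold
              ncol = 3, byrow = TRUE,
              dimnames = list(paste0("resp_", 1:4), paste0("f", 1:3)))

flagged <- flag_qsorts(loa, nstat = 33)
flagged
```

With 33 statements the threshold is about 0.34, so the first two Q-sorts are flagged for one factor each, while the confounded and the weakly loading Q-sorts are not flagged at all.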
The z-scores (b) indicate the relationship between statements and factors, i.e., how much each factor agrees with a statement. The z-score is a weighted average of the scores given by the flagged Q-sorts to that statement. The factor scores are obtained by rounding the z-scores towards the array of discrete values in the grid. In Figure 1, this array of discrete values would be c(-4,-4,-3,-3,-3,-2,... ,4).
The final outcome of the analysis is the selected number of factors, representing one perspective each. Each perspective is a hypothetical Q-sort that has been reconstructed from the factor scores.
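A minimal base R sketch of step (b) follows. The weighting w = l / (1 - l^2) is an assumption taken from the usual presentation of Brown (1980), the function name zscores_sketch() is hypothetical, and the data and loadings are invented; the package's own implementation is qzscores().

```r
# dataset: statements x Q-sorts; loa and flagged: Q-sorts x factors
zscores_sketch <- function(dataset, loa, flagged, distribution) {
  dataset <- as.matrix(dataset)
  w <- loa / (1 - loa^2)     # assumed weights, after Brown (1980)
  w[!flagged] <- 0           # only flagged Q-sorts contribute
  wsum <- dataset %*% w      # weighted sum per statement and factor
  zsc <- scale(wsum)         # standardise each factor: mean 0, sd 1
  # Factor scores: assign the sorted grid values in z-score order
  fsc <- apply(zsc, 2, function(z) {
    out <- numeric(length(z))
    out[order(z)] <- sort(distribution)
    out
  })
  dimnames(fsc) <- dimnames(zsc)
  list(zsc = zsc, fsc = fsc)
}

grid <- c(-2, -1, -1, 0, 0, 0, 1, 1, 2)   # hypothetical 9-cell grid
set.seed(7)
dataset <- replicate(4, sample(grid))
dimnames(dataset) <- list(paste0("sta_", 1:9), paste0("resp_", 1:4))

# Hypothetical loadings: two Q-sorts define each factor
loa <- matrix(c(0.8, 0.1,
                0.7, 0.2,
                0.1, 0.8,
                0.2, 0.7),
              ncol = 2, byrow = TRUE,
              dimnames = list(colnames(dataset), c("f1", "f2")))
flagged <- loa > 0.5

res <- zscores_sketch(dataset, loa, flagged, distribution = grid)
res$fsc
```

Each column of res$fsc is a reconstructed Q-sort that uses exactly the values of the grid, which is what makes the perspectives directly comparable with the responses of individual participants.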
Next, some general characteristics are calculated in order to compare the factors. For each factor, the following are provided: the number of flagged Q-sorts, the composite reliability, and the standard error (SE) of factor scores. Two additional matrices indicate the similarity between the z-scores of each pair of factors: the correlation coefficients and the standard error of differences (SED, based on the SE).
Finally, the factor comparison identifies the consensus and distinguishing statements (c). For each pair of factors, if the difference between the z-scores of a statement is statistically significant (based on the SED), then what both factors think about that statement is distinct. When none of the differences between any pair of factors are significant, then the statement is considered of consensus.
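The factor characteristics can be sketched in a few lines of base R. The formulas are assumptions taken from the usual presentation of Brown (1980), with an average reliability coefficient of 0.8 and z-scores of unit standard deviation; the package's own implementation is qfcharact(). With three flagged Q-sorts per factor, the sketch gives a composite reliability of 0.92 and an SE of 0.28, the values reported in the usage example of this paper.

```r
n_flagged <- c(f1 = 3, f2 = 3, f3 = 3)   # flagged Q-sorts per factor
av_rel <- 0.8                             # average reliability coefficient

# Composite reliability of each factor
comp_rel <- av_rel * n_flagged / (1 + (n_flagged - 1) * av_rel)

# Standard error of factor scores, assuming z-scores with sd = 1
se <- sqrt(1 - comp_rel)

# Standard error of differences for each pair of factors
sed <- sqrt(outer(se^2, se^2, "+"))

# A difference in z-scores is significant when it exceeds these thresholds
thr_05 <- 1.96 * sed   # p < .05
thr_01 <- 2.58 * sed   # p < .01

round(comp_rel, 2)
round(se, 2)
round(thr_05, 2)
```

Only the off-diagonal cells of sed and of the threshold matrices are meaningful, since a factor is never compared with itself.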

Interpretation and reporting
The interpretation of each perspective is based on the Q-sort reconstructed from the factor scores and on the salience and distinctiveness of the statements. Each respondent may be more closely related to one of the perspectives, and this relation is determined by the loadings calculated at the beginning.
The key elements to look at are the relative position of statements within the grid (particularly those at the extremes), the position of a statement in a perspective versus the position of the same statement in other perspectives, and the distinguishing and consensus statements. Each perspective is given a semantic denomination and is described in as much length as necessary, each description deriving from the literature and from qualitative explanations collected after each response.
The essential characteristics of a Q study include the process of selecting statements, the shape of the distribution grid, the number of participants and the criteria for their selection, the methods for extraction and rotation of factors, and the number of Q-sorts loading on each factor. The results are usually reported with a table of statements including either their z-scores or factor scores, and an indication of which statements are distinguishing and which consensus. The table of factor loadings may also be included, showing the Q-sorts that were flagged.

Implementation of the analysis in qmethod
The core of the package consists of a main function qmethod() and four subordinate functions that correspond to the steps of the analysis: qflag(), qzscores(), qfcharact(), and qdc(). The function qmethod() is a wrapper that calls internally PCA to calculate loadings and the four other functions. The individual functions can be run independently to build the analysis step-by-step in order to maintain more control over what happens at each stage or to perform more advanced analysis. Yet running the individual steps will rarely be necessary unless the researcher wants to use other methods for extraction or manual flagging. The core functionality is complemented with additional functions to print, summarize, plot, import, and export.
The raw data is provided to qmethod() as a matrix or data frame with statements as rows and Q-sorts as columns. The number of factors to extract must be specified, and this can be decided upon exploration of the raw data based on criteria recommended in the literature, as explained above in Analytical process. The method for rotation is "varimax" by default, but other methods can be specified. If respondents do not have to follow the distribution grid strictly when sorting the statements, then the argument forced should be FALSE and a vector must be provided in the argument distribution. This distribution vector is the array of values corresponding to the grid. By calling qmethod() with all the necessary arguments, the full analysis is performed and the outputs are put together in an object of class "QmethodRes".
In order to run manual flagging, the functions corresponding to individual steps may be used instead of qmethod(): namely, qzscores() and qdc() (qfcharact() is called internally in qzscores()). First, and in order to assess which Q-sorts to flag, one may run the function qflag() and examine the resulting table of loadings. Second, in qzscores() a logical matrix of n Q-sorts and f factors may be provided in the argument flagged, where the cells may be TRUE to indicate flagging. After calculating the z-scores, distinguishing and consensus statements may be identified using the function qdc().
The package also allows the use of correlation coefficients other than Pearson for the extraction of factors, namely Spearman and Kendall. These may be appropriate for non-parametric data and may sometimes enable a greater amount of variability to be explained with fewer factors (for a technical note about correlation coefficients, see Brown, 1980, p. 276).
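The manual flagging workflow might look like the sketch below. It assumes that the packages qmethod and psych are installed, and that qflag() accepts the arguments loa and nstat and qzscores() the arguments dataset, nfactors, loa, and flagged; these argument names should be checked against the reference manual of the installed version. The decision to unflag the first Q-sort is purely hypothetical.

```r
if (requireNamespace("qmethod", quietly = TRUE) &&
    requireNamespace("psych", quietly = TRUE)) {
  library(qmethod)
  data(lipset)
  dataset <- lipset[[1]]

  # Loadings from PCA with varimax rotation
  loa <- unclass(psych::principal(dataset, nfactors = 3,
                                  rotate = "varimax")$loadings)

  # Start from the automatic flags and inspect them next to the loadings
  flagged <- qflag(loa = loa, nstat = nrow(dataset))
  print(round(loa, 2))
  print(flagged)

  # Hypothetical manual decision: the first Q-sort defines no factor
  flagged[1, ] <- FALSE

  # z-scores and factor scores based on the edited flags
  manual <- qzscores(dataset, nfactors = 3, loa = loa, flagged = flagged)
}
```

The guard with requireNamespace() only keeps the sketch from failing where the packages are missing; in an interactive session the body can be run directly.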
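The effect of the correlation coefficient can be explored before running the full analysis. The sketch below uses only cor() and eigen() from base R on invented data, and compares the share of variability captured by the first component under each coefficient.

```r
set.seed(3)
grid <- c(-2, -1, -1, 0, 0, 0, 1, 1, 2)   # hypothetical 9-cell grid
qsorts <- replicate(4, sample(grid))
colnames(qsorts) <- paste0("resp_", 1:4)

# Correlation matrices between Q-sorts under the three coefficients
methods <- c("pearson", "spearman", "kendall")
cors <- lapply(methods, function(m) cor(qsorts, method = m))
names(cors) <- methods

# Percentage of variability captured by the first component in each case
first_comp <- sapply(cors, function(r) {
  ev <- eigen(r, symmetric = TRUE)$values
  100 * ev[1] / sum(ev)
})
round(first_comp, 1)
```

Because Q-sorts contain many tied values, the rank-based coefficients can differ noticeably from Pearson, so the comparison is worth making on the actual data rather than assumed.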

Understanding and exploring results from the qmethod() function
The function qmethod() returns the results in a list of class "QmethodRes" containing eight objects. The method print() for an object of class "QmethodRes" provides a snapshot of the full results with descriptive names for each object within the list, as listed below (in parentheses, the actual names of the objects within the list). The method summary() displays the essential tables. In order to visualize the results at a glance, the method plot() builds a dot-chart of z-scores, as in Figure 2.
1. "Q-method analysis" (...$brief): a list with basic information of the analysis including date, number of Q-sorts and of statements, number of factors extracted, and rotation.
8. "Distinguishing and consensus statements" (...$qdc): a data frame that compares the z-scores between all pairs of factors, obtained from qdc().
The last object "Distinguishing and consensus statements" may be explained in detail. This object results from an internal call to the function qdc(). For each pair of factors, this function calculates the absolute difference in z-scores and compares this difference with the significance thresholds for .05 and .01 p-value levels. The function qdc() returns a data frame with statements as rows and comparisons as columns. All the comparisons are synthesised in the first variable of the data frame, which is a categorical variable named "dist.and.cons" that indicates whether the statement is of consensus or distinguishing for one or more factors (see an example below in Usage example). The following are the possible categories that a statement can fall into in the "dist.and.cons" variable:
• "Distinguishes all": When all the differences between all pairs of factors are significant.
• "Distinguishes fi only": When the differences between factor i and all other factors are significant, and the differences between all other pairs of factors are not.
• "Distinguishes fi (...)": When the differences between factor i and all other factors are significant, and some (but not all) of the differences between other pairs of factors are significant. If this is the case for more than one factor, the string is concatenated, e.g., "Distinguishes f1 Distinguishes f3". This category may arise only in solutions of four or more factors.
• "Consensus": When none of the differences are significant because all factors give the statement a similar score.
• "": Leaves an empty string in the cell of those statements which do not fulfil any of the above conditions, i.e., statements that are neither consensus nor clearly distinguishing any factor from all the rest. While these statements do not distinguish any particular factor from all the rest, they do distinguish some pairs of factors. The role of these statements may be inspected in detail by looking at the significance columns.
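The decision rules behind these categories can be sketched in base R for a single statement. The functions classify_statement() and make_sig() are hypothetical helpers written for this illustration; they follow the descriptions in the list above and are not the code of qdc().

```r
# sig: symmetric logical matrix (factors x factors); sig[i, j] is TRUE when
# the z-score difference between factors i and j is significant
classify_statement <- function(sig) {
  f <- nrow(sig)
  pairs <- sig[upper.tri(sig)]
  if (all(pairs)) return("Distinguishes all")
  if (!any(pairs)) return("Consensus")
  # Factors that differ significantly from every other factor
  distinct <- which(sapply(seq_len(f), function(i) all(sig[i, -i])))
  if (length(distinct) == 0) return("")
  # Significant differences among the pairs that exclude each such factor
  other_pairs <- sapply(distinct, function(i) {
    rest <- sig[-i, -i, drop = FALSE]
    any(rest[upper.tri(rest)])
  })
  if (length(distinct) == 1 && !other_pairs) {
    return(paste0("Distinguishes f", distinct, " only"))
  }
  paste(paste0("Distinguishes f", distinct), collapse = " ")
}

# Helper: build the matrix from a list of significant pairs
make_sig <- function(f, pairs) {
  sig <- matrix(FALSE, f, f)
  for (p in pairs) sig[p[1], p[2]] <- sig[p[2], p[1]] <- TRUE
  sig
}

classify_statement(make_sig(3, list(c(1, 2), c(1, 3), c(2, 3))))
classify_statement(make_sig(3, list(c(1, 2), c(1, 3))))
classify_statement(make_sig(3, list()))
classify_statement(make_sig(3, list(c(1, 3))))
classify_statement(make_sig(4, list(c(1, 2), c(1, 3), c(1, 4), c(2, 3), c(3, 4))))
```

The last call illustrates why the concatenated label needs at least four factors: factors one and three each differ from all the others while the pair formed by factors two and four does not differ, which cannot happen with three factors.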
This structure of results is different from that of other Q software and it contains all the necessary information without any redundancy. This output can be converted into the exact outline provided by PQMethod by using the function export.qm() (see below), an outline that is much longer. Most of this conversion consists of taking the data frames of z-scores, of factor scores, and of distinguishing and consensus statements (objects 5, 6, and 8 within the list of results), and reordering or merging them according to different criteria.

Importing and exporting data
The function import.pqmethod() retrieves data from a .DAT file, which is the raw data file saved by PQMethod software. Individual data frames from a "QmethodRes" object may be exported as a CSV using, for example, write.table() (to find the objects to export from within the list of results, see the description of the outputs above in Understanding and exploring results from the qmethod() function). The function export.qm() saves all the results obtained from qmethod() in a text file, building the report which is then used for the interpretation. This report has two flavours defined in the argument style: "R" and "PQMethod". "R" exports the results exactly as the function qmethod() returns them. "PQMethod" exports the results following the structure of the output in PQMethod (a .LIS file). Note that the latter is a much longer outline and has some redundant information in the form of tables reordered according to different criteria. This alternative outline might be convenient for researchers accustomed to PQMethod.

Usage example
For demonstration purposes, I use the Lipset dataset about the value patterns of democracy (Lipset 1963 and Stephenson 1970, both in Brown, 1980), which contains 9 respondents and 33 statements. The following code performs a full analysis using principal components and varimax rotation to extract three components (factors).

    library(qmethod)
    data(lipset)
    lipset[[1]] # Shows the dataset, a matrix of 33x9
    lipset[[2]] # Shows the text of the 33 statements
    results <- qmethod(lipset[[1]], nfactors = 3, rotation = "varimax")

The object results is of class "QmethodRes", and the specific method summary() for this class returns the basic information and the data frame of factor scores as shown below. This data frame contains the three factors or main perspectives. Each perspective has a distinct array of statement scores, which correspond to the scores in Figure 1 and indicate the agreement or disagreement of the given perspective with each statement. For example, perspective one is in strong agreement with statement 1 ("sta_1" has a score of 4), whereas the statement deserves the opposite opinion according to perspective two (a score of −2) and perspective three considers it in the middle ground (a score of 1). The next matrix contains general information about each factor, of which the most relevant piece may be the number of loading Q-sorts and the explained variance, which are approximate indicators of the strength of each perspective and of the proportion of the opinions they explain.
                                           f1     f2     f3
    Average reliability coefficient      0.80   0.80   0.80
    Number of loading Q-sorts            3.00   3.00   3.00
    Eigenvalues                          2.09   1.97   1.68
    Percentage of explained variance    23.17  21.93  18.68
    Composite reliability                0.92   0.92   0.92
    Standard error of factor scores      0.28   0.28   0.28

Any of the results may be retrieved by using the corresponding object name indicated under Understanding and exploring results from the qmethod() function, and thus customised for easier exploration. For instance, the z-scores may be shown by using the command results$zsc. In the example below, the factor scores are merged with the actual text of the statements and then ordered. The data frame is reordered according to the scores of the statements for each factor, so that the researcher can quickly identify which statements are in most agreement for a given perspective, and what other perspectives think of the same statements:

    # Merge the statements with their actual text:
    scores <- cbind(results$zsc_n, lipset[[2]])
    # Order the results by the scores of each factor:
    for (i in seq_len(ncol(results$loa))) {
      View(scores[order(scores[, i], decreasing = TRUE), ],
           title = paste0("Order for f", i))
    }

The method plot() for class "QmethodRes" returns a dot-chart of the z-scores specifically adapted for Q methodology, as in Figure 2. In this figure, built with the code below, the comparison among the z-scores of all factors can be explored. For example, all three points are far from each other in statement 33, meaning that each of the three factors holds a distinctive opinion regarding this statement. For statement 20, however, the points are clustered together, indicating consensus. Finally, statement 16 clearly distinguishes factor three from the rest (its point being far from the other two).
    par(lwd = 1.5, mar = c(4, 4, 0, 0) + 0.1)
    plot(results)
    abline(h = seq(from = 2, to = 32, by = 3), col = grey(0.2), lty = 2)

The table of distinguishing and consensus statements below conveys the observations gleaned from Figure 2 with greater precision. For example, column "f1_f2" shows the absolute difference in z-scores between factor one and factor two. In the column immediately to the right ("sig_f1_f2"), a single star or double star indicates differences that are significant at p-values < .05 and < .01 respectively, and arises from the magnitude of the difference and the thresholds given by the SED. In the above example, the statements 3, 6, 8, 9, etc. (labelled "Distinguishes f1 only") distinguish factor one (f1) but do not distinguish f2 from f3. The statements 1, 7, 15, 19, and 33 (labelled "Distinguishes all") distinguish both f1 from the other two and also f2 from f3: all factors think differently about these statements. Meanwhile, statements 14 and 20 are of consensus because none of their differences are significant at p-level = .05 (no stars appear in any of the "sig_*" columns). In addition, those statements with empty values under "dist.and.cons" need to be looked at individually (statements 12, 17, 22, 25, 27, and 29). For example, statements 12 and 25 distinguish f1 from f3 (the difference is significant at p-value < .05, as indicated in column "sig_f1_f3"), but they distinguish neither of these two factors from f2 (none of the other comparisons are significant).

Validation
The package was validated with the lipset dataset and with three other datasets, extracting 2, 3, 4 and 5 factors with each of them. The results of qmethod were contrasted with the results of analyses based on the same options but performed in PQMethod. For studies of 1 to 3 factors, all the numbers in factor loadings and z-scores match to the fourth decimal those given in PQMethod. For studies of 4 or more factors, all the numbers match to the second decimal. Occasional divergences in the third and fourth decimals of the loading values arise from the PCA algorithms themselves, which are coded externally to this package. The factor scores match in all cases.
The selection of distinguishing statements matches exactly. A difference in the selection of consensus statements is due to a greater restrictiveness in this package. In qmethod, the only statements identified as consensus are those in which none of the differences are significant at p-value < .05 (that is, only those statements which do not produce stars in any columns). PQMethod also indicates consensus statements with no significances at p-value < .05 with a star, but it further identifies as consensus those statements with some differences significant at a p-value between .01 and .05 (these statements have single stars in some of the comparisons, though no double stars). In PQMethod, therefore, the statements with differences significant at a p-value between .01 and .05 are shown in both the consensus and the distinguishing lists for some or all of the factors. For example, in the above table, statements 12, 17, 25, 26, and 29 have no double stars but have one or more single stars; in PQMethod these would be included as both distinguishing and consensus statements. Such double labelling can be confusing in the interpretation. In this package, by contrast, the statements with differences significant at a p-value between .01 and .05 are not labelled as consensus, but rather as "Distinguishes f*", "Distinguishes all", or "", depending on each case. The role of each statement can be fully understood by inspecting the table of distinguishing and consensus statements.
The order of factors in the matrices (e.g., in the matrix of loadings) may differ between both tools in some cases. This is because in R, the components in PCA are ordered according to the explained variance of the rotated components. In PQMethod, the factors are ordered according to the explained variance of the unrotated factors instead. This discrepancy affects neither the numerical results nor the interpretation.

Summary and future work
Q is an effective methodology for understanding the diversity of perspectives across disciplines. qmethod is the first R package to analyse Q methodology data. This package produces tabulated results that are easy to examine and interpret, and ready for graphical representation or further numerical analysis. It provides a more concise output of distinguishing and consensus statements as well as a synthesising plot function. This core functionality is complemented by additional functions that import data from other Q software, summarise the results, and export the outputs in plain text for the interpretation in two flavours. Further usage details can be found in the qmethod reference manual available from CRAN. Potential developments for the current implementation include the introduction of centroid extraction as an alternative to PCA, manual rotation of factors, a graphical interface, functions for data collection, and a 3D plot method to explore the results further. Researchers who would like to contribute to these or other developments are welcome to contact the author.

Figure 1: An example of a grid to sort statements in Q method.

Figure 2: An example of a plot of an object of class "QmethodRes". The statements are ordered by the standard deviation of z-scores for all three factors.