User talk:Holon

From Wikipedia, the free encyclopedia

1 Welcome from Redwolf24
2 Rasch models
3 Re Georg Rasch
4 Comment added to your user page
5 Regarding changes to Stevens' Power Law
6 Further discussion on measurement
7 Question about CAT and dependent items
8 Rubrics for assessment

Welcome from Redwolf24

Welcome!

Hello, and welcome to Wikipedia. Thank you for your contributions. I hope you like the place and decide to stay. We as a community are glad to have you and thank you for creating a user account! Here are a few good links for newcomers:

The Five Pillars of Wikipedia
How to edit a page
Editing, policy, conduct, and structure tutorial
Picture tutorial
How to write a great article
Naming conventions
Manual of Style
Merging, redirecting, and renaming pages
If you're ready for the complete list of Wikipedia documentation, there's also Wikipedia:Topical index.

I hope you enjoy editing here and being a Wikipedian! By the way, please be sure to sign your name on Talk and vote pages using four tildes (~~~~) to produce your name and the current date, or three tildes (~~~) for just your name. If you have any questions, see the help pages, add a question to the village pump or ask me on my Talk page. Again, welcome!

Redwolf24 9 July 2005 08:48 (UTC)

P.S. I like messages :-P

Rasch models

I moved Rasch model, and a few other psychological and sociological terms, out of Category:Measurement because it seems inconsistent to have only a few topics on psychological measurement in Category:Measurement, and the category appeared to be in some disarray. This is especially so given that there is a Category:Psychometrics, which I made into a subcategory of Measurement, since "Psychometrics is the field of study concerned with the theory and technique of psychological measurement" according to Psychometrics. So Rasch model and the few other topics which were in both Category:Psychometrics and Category:Measurement are now just in Category:Psychometrics, which is now a subcategory of Measurement. Salsb 11:26, 28 September 2005 (UTC)

Re Georg Rasch

I edited the article to amend a poor use of English. You made a change to my edit, which reinstalled a poor use of English, so I reverted. I didn't notice that in that edit you had also included two new pars. I am sorry about that. But, that section excluded, you have now commented that "Changes made were factually incorrect". What changes were factually incorrect? Moriori 07:52, 30 November 2005 (UTC)

Moriori, I agree with you there was a problem in one sentence and it's good you pointed it out, but you changed that sentence to read: "He was a member of the International Statistics Institute where he studied with Ronald Fisher ..." To my knowledge, Rasch didn't study at the Institute with either person. How about trying to be constructive ... "amend poor English" comes across as being a tad abrasive. BTW, I assume you mean 'reinstated', not 'reinstalled'? I see from your talk page you've edited other articles very early - seems to be something of a difference in viewpoint from some contributors. I wanted to get this started, and I'll come back and clean it up myself. I try not to begin articles in too rough a form, but life's busy! I welcome the edits, but when I saw that paragraphs had been removed without a stated reason, and that a factually incorrect sentence had been introduced (inadvertently no doubt) I thought it better to revert. Anyhow, if you can, let me know which sentences you think need work, otherwise I'm sure I can figure that out and make appropriate changes later. Take care smhhms 08:18, 30 November 2005 (UTC)

Hi smhhms. OK, so Rasch didn't study at the institute with both persons. But your sentence stated the following: "He was a member of the International Statistics Institute who studied with Ronald Fisher and also, briefly, with Ragnar Frisch." It is easy to see why someone could interpret that to mean he studied at the institute with those persons.

Well, not easily, no. Frisch founded an Institute of Economics. Among those reading this sort of article, not many would interpret it this way. Nevertheless, you're quite right that the structure of the sentence needed improvement - and I can just about guarantee I would have done so pretty soon.

BTW -- instate, vt, to install. You say tomatos and I say tomatos. (Some even say tomatoes).

Matter of etymology - never seen reinstalled used in that way but hey, what would I know, I'm just an Aussie ;->

You note that I have edited other articles very early. I sure have, because they have desperately needed attention. Wiki is immediate. People who come here judge the quality of articles at the time that they read them. The reputation of Wiki rests on veracity and quality, and incoherent stubs and other nonsense that slip through the system drag it down. That comment is a generalisation and is not aimed at you or any other particular contributor. I agree life is busy, and that some people may intend to come back later to make appropriate changes to sentences/paragraphs that need work. However, in the meantime, someone visits us, reads an untweaked entry, and wonders about the quality of Wiki. Incidentally, I don't claim to be perfect. My original contributions have been edited by others and I welcome constructive changes. I have edited ambiguous articles to say something that the original poster never intended, and got a rocket for doing so. But, the good part of it is that the articles end up being a lot less ambiguous than when originally posted. Cheers Moriori 21:34, 30 November 2005 (UTC)

Editing is great, and when constructive it generally leads to greatly improved results. Personally, I find it much easier to start on an article and see what it looks like on screen. This means that for a day or two, or maybe a little longer, it's a bit rough. With a new article to which few others are linked, I very much doubt there will be a great deal of traffic early (other than among those looking specifically at new articles). Indeed, I'd be willing to bet on the number of visits being roughly Poisson with mean a function of (among other things) number of links and longevity. BTW, I see your view on skepticism is a hell'v'a lot like mine. Great to see. Take care mate smhhms 02:01, 1 December 2005 (UTC)

Comment added to your user page

Misplaced comment moved here. --cesarb 09:31, 24 February 2006 (UTC)

Dear Holon, both intros to the Quantity page must somehow be merged; as it stands now it is not good: the definition is not strong, quantity is not a relation, etc. I await your combined version today. Thanks. (The preceding unsigned comment was added by Azamat Abdoullaev (talk • contribs) 08:45, 24 February 2006 (UTC).)

I am not going ahead, but am waiting for your merged version of the introduction to Quantity. Below is my part.

'Quantity is among the basic classes of things along with quality, substance, change, and relation. Initially, quantity was introduced as quantum, an entity having quantity. Generally, quantity is viewed as the basic property of things existing as magnitudes or multitudes, or the state of being much. Being a fundamental term, quantity is used to refer to any type of quantitative properties or attributes of things. Of entities which pertain to quantities, some are such by their inner nature (as number), while others are functioning as states (properties, dimensions, attributes) and modifications like as heavy and light, long and short, broad and narrow, small and great, or much and little. Two basic differences of quantity, magnitude and multitude (or number), imply the principal distinction between continuity (continuum) and discontinuity. Under the names of multitude come what is discontinuous and discrete and divisible into indivisibles, all cases of collective nouns: army, fleet, flock, government, company, party, people, chorus, crowd, mess, and number. Under the names of magnitude come what is continuous and unified and divisible into divisibles, all cases of common names or mass nouns: the universe, matter, mass, energy, liquid, material, animal, plant, tree.'

Azamat Abdoullaev, 24 February 2006

Okay, I'll look at it as soon as I can Azamat. Holon 15:00, 24 February 2006 (UTC)

Please discuss at talk:quantity. I have detailed issues regarding parts of the proposed text. Cheers Holon 05:23, 25 February 2006 (UTC)

I'll be on vacation shortly (I'm having trouble transcluding the announcement on both my talk page and my user page.) I may comment in the next few minutes, or when I get back. — Arthur Rubin | (talk) 02:00, 3 March 2006 (UTC)

Cheers Holon 02:35, 3 March 2006 (UTC)

Regarding changes to Stevens' Power Law

Dear Holon,

I understand that you would like to retain some details in the criticism section of the entry Stevens' power law. In this context I would like to point out a couple of things.

You retained, "...consider three stimuli x, y, and z. If it is reported, for example, that the ratio of perceived intensity of z:y is 2:1, and of y:x is also 2:1, then it should be reported that the ratio of perceived intensity of z:x is 4:1." This is equivalent to numbers not being judged in a veridical way, as it (now) says above. This was obviously not clear in my editing.

You retained, "Narens (1996) formally stated and tested these assumptions, and reported negative results." Narens did indeed formulate the underlying assumptions, but he neither performed nor reported any tests.

I have reformulated the entry to both incorporate what you wanted to retain as well as the rest of Narens' (1996) results. Specifically, I've spelled out Narens' (1996) multiplicative property, and the associated (negative) empirical results. I have also added his commutative property and the associated (positive) results. Together, these show that Stevens' assumption of veridical judgments of numbers is wrong, but ratio scaled judgments seem correct. I've also added recent empirical results on the power law in the context of axiomatic psychophysics (when will someone write an entry on that discipline?).

I hope you will find this to include what you wanted to retain.

(Rutuag 07:03, 24 June 2006 (UTC))

Thanks -- I hope you won't take anything I say as anything other than for the sake of clarification and refinement. I think it is good to discuss these things. Mostly, I'm happy with the changes. I do not like the wording "veridical interpretation of numbers". Can you clarify what this means? The point is not how numbers are interpreted; it is whether the numbers are measurements of perceptions. If this is exactly the way it is stated in the literature on axiomatic psychophysics, fair enough, though I think it is very misleading.

I find your response very reasonable. The term "veridical" is sometimes used (Steingrimsson & Luce, in press) but I agree that it may not be entirely satisfactory. There are two things here. First, what meaning is it intended to convey, and second, is there a better way to express that meaning?

The meaning: this is best explained formally, but I was trying to avoid doing that in the entry, perhaps not a good choice? Let $W$ be a cognitive function capturing respondents' interpretation of numbers $p$; then by veridical I try to capture the identity function, namely $W(p) = p$.

I’m not very familiar with the formalisms in axiomatic psychophysics. How might $W(p)$ be evaluated? I mean, what kind of test is used in this context to evaluate whether $W(p) = p$ from an experimental outcome? I assume it would be evaluated, for example, just by comparing the produced magnitude with the reference to see if it is in fact $p$ times the physical magnitude of the reference.

As I wrote below, Narens' (1996) multiplicative axiom tests directly whether $W(p) = p$.

Discussion continued below ... Holon 05:48, 28 June 2006 (UTC)

Second, my use of "veridical" comes from Steingrimsson & Luce (in press), where they write, "...veridical interpretation of numbers: some degree of distortion on the part of the respondent is to be expected." So, perhaps using that clarification would help? Zimmer (1995) talks of "interpret [numbers] as “true” scientific numbers." which I like less. Narens (1996) talks of "subject's subjective interpretations of numerals", but one would have to add something like, ", which is not the identity function". Perhaps one could say, "respondents interpret numbers not identical to their mathematical use" but that may be more murky. So, you can see, good wording doesn't come easy in this case.

Narens' (1996) multiplicative axiom tests directly whether $W(p) = p$, and the several subsequent experiments (Ellermeier & Faulhammer, 2000; Zimmer, 2005, [the list could be extended]) found it to fail.

"Without assuming veridical interpretation of numbers, Narens (1996) formulated another property that, if sustained, meant that respondents could make ratio scaled judgments, namely, if y is judged p times x, z is judged q times y, and if y' is judged q times x, z' is judged p times y', then z should equal z'. This property has been sustained in a variety of situations (Ellermeier & Faulhammer, 2000; Zimmer, 2005)."

What I tried to capture by "Without assuming veridical interpretation of numbers,..." is a summary of what Narens (1996) writes, "...there are ratio magnitude estimation situations in which the multiplicative property fails but the commutative property holds and the situation can be measured in such a way that (i) a ratio scale S on the stimuli results, and (ii) there is a strict order preserving mapping [W] from the numerals...into the positive real numbers..." That is, for all stimuli $x$ and $t$, all numerals $p$, and with the psychophysical function $\psi$, if $x$ is judged $p$ times $t$, then $\psi(x) = W(p)\,\psi(t)$.

In sum, the commutative property by itself is sufficient to ensure ratio-scale measurement. If the multiplicative property holds, it means $W(p) = p$. Since it fails, we have $W(p) \neq p$. That's it. Perhaps this is the way to go, simply talk about it formally. What do you think?
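A worked sketch of the algebra behind the two properties may help here. It assumes only the representation quoted above, $\psi(x) = W(p)\,\psi(t)$ whenever $x$ is judged $p$ times $t$, and uses the stimuli $x, y, y', z, z'$ from the statement of the commutative property. The extra stimulus $z''$ is notation introduced purely for this illustration and is not taken from Narens (1996).

$$
\begin{aligned}
\text{Commutative:}\quad & \psi(z) = W(q)\,\psi(y) = W(q)\,W(p)\,\psi(x), \\
& \psi(z') = W(p)\,\psi(y') = W(p)\,W(q)\,\psi(x) \;\Rightarrow\; \psi(z) = \psi(z'). \\
\text{Multiplicative:}\quad & \text{let } z'' \text{ be judged } pq \text{ times } x, \text{ so } \psi(z'') = W(pq)\,\psi(x); \\
& \psi(z) = \psi(z'') \iff W(pq) = W(p)\,W(q).
\end{aligned}
$$

Under this representation the commutative property holds for any $W$, whereas the multiplicative property places an additional constraint on $W$, one that the identity $W(p) = p$ satisfies.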

The way you explained is fine to me – I’m just not clear on precisely what constitutes "veridical interpretation of numbers". I’m pretty sure I understand now -- I think it means "interpretation" that would literally make the number a measurement of physical magnitude.

It simply means that there is no cognitive distortion. With $W$ a function describing how humans interpret numbers, veridical means $W(p) = p$ (which is not the case). Your second point: there is a subtle but fundamental point to be made here: measurement with numbers is simply the assignment of numbers according to a rule (Stevens' definition). Psychological measurement aims not to quantify the physical magnitude, but rather to map the relation between the physical magnitude and the psychological one. Put another way, it is concerned with measuring sensation.

Believe me, I'm well aware psychological measurement aims not to quantify the physical magnitude.

I only wrote this in response to your saying, "I think it means "interpretation" that would literally make the number a measurement of physical magnitude."

I discussed precisely this in detail in Ch1 of my thesis [1]. Stevens' definition of measurement is defective (Michell, 1999). See psychometrics, where I have outlined some of the background to Stevens' definition of measurement. Michell shows in detail why it is defective, both from a representational point of view and a classical point of view (the latter being simply the definition of measurement in physics). Yes, psychophysical measurement is concerned with measuring sensation -- but you still must demonstrate that you meet criteria for measurement, unless you adopt Stevens' definition.

You take my words too literally. If I were arguing Stevens' version of measurement and that to be the end of it, we wouldn't be having this conversation. The entire thrust of the discussion is what underlay his methodology.

Have you read Thurstone's work? See Law of comparative judgment for an overview. Please keep in mind the axiomatic approach is not the only approach.

Do you bring Thurstone into the picture as another approach to measurement? If so, there is no contradiction: to stay with the theme, see e.g., Luce (1994) on this (http://www.imbs.uci.edu/personnel/luce/1994/Luce_Psychological%20Review_1994.pdf). The issues that separate axiomatic measurement approaches and, say, psychometrics are not lost on me, nor are Michell's issues. I can't think of a better way to cast the problems of this than by the following: "The failure of measurement to "take" in cognition and psychometrics is related to a deep conceptual question concerning the relationship between statistics, as a way of describing randomness, and measurement, as a way of describing structure. The lack of an adequate theory for this relationship is, in reality, a weakness of both fields." (Luce & Narens, 1993) (http://www.imbs.uci.edu/personnel/luce/1993/NarensLuce_PsychScience_1993.pdf).

I think it vital not to confuse foundational issues of measurement and approaches to measurement: in this context it is perhaps instructive to quote Michell, "The measurability thesis, the rock upon which quantitative psychology is built, and conjoint measurement theory, psychology's best chance of checking the foundations upon which this rock stands..." (Michell, 1999, p. 213).

Is $W(p)$ above simply a monotonic function?

$W(p)$ is usually required to be monotonic and order-preserving, i.e., for two numbers $p, q$: $p < q$ iff $W(p) < W(q)$. If you are interested in the latest on this subject, you can look at Steingrimsson & Luce (in press) as a technical report at http://www.imbs.uci.edu/tr/abs/2006/mbs06_03.pdf.

Do you know what tests the results were subjected to in order to claim "a ratio scale S on the stimuli results"?

Narens (1996) showed his commutative property is sufficient to establish a ratio scale.

A ratio scale requires demonstration that experimental outcomes are a function of the ratio of a magnitude to a unit. Formalisms are usually consistent with this. Quantitative structure is a scientific hypothesis. The devil is in the detail. The problem often lies in tests that are not sufficiently sensitive to relevant departures to show that measurement fails. I'll check it out anyhow.

I am unsure as to what you are getting at here. It sounds as if you are a priori sceptical about what has been discussed but will not let that deter you from further exploration. The company of Suppes, Krantz, Tversky, Tukey, Luce, Narens, and others is well worth a visit. The other part of what you write seems to hark back to the issue of error in measurement. I talked some about that in the above and some more in the following.

Are you familiar with measurement theory and tests for double cancellation and the like? Do you know whether any such tests were used?

Yes, I am somewhat familiar with this literature and the property of double cancellation. In this context, you may be interested in Steingrimsson & Luce (2005a), in which they test the Thomsen condition, a close relative of double cancellation. So, yes, it has been tested in the auditory domain. The property is related to extensive measurement and is necessary and sufficient to establish an additive representation. A web search reveals that the paper I mentioned as well as many related ones are available on Luce's webpage (http://www.imbs.uci.edu/personnel/luce/luce.html).

I'm aware of the Thomsen condition, but there are a lot more issues. Joel Michell and others claim that generally, such attempts fail to properly demonstrate measurement has been achieved. I have a friend who knows much more than I do about this. I might get in touch with him and get his view about this literature. Thanks for the url, that's great.

Michell (1999) writes, "...this prediction provides a specific test of the hypothesis that the attributes are quantitative...this test is called the Thomsen condition...a key condition in the theory of conjoint measurement...the important point is that a way, distinct from extensive measurement, had been specified whereby the hypothesis that an attribute has additive structure could be tested." (pp. 202-203). This is the same point as I attempted to make.

There must be some error; i.e. any results would only hold stochastically (unless all magnitudes were far apart).

Are you referring to the statistics used? Behavioral axiom testing is usually of the form $A = B$, where $A, B$ are reduced to numbers and then their equality is tested. Most commonly, such tests are done using non-parametric tests such as the Mann-Whitney U.
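To make the kind of test mentioned above concrete, here is a minimal sketch of a two-sided Mann-Whitney U test written with the Python standard library only. It is an illustration under stated assumptions, not the procedure used in any of the cited studies: the function name, the data values, and the use of a normal approximation without tie or continuity correction are all choices made for this example.

```python
import math

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U, p_value). Tied values receive average ranks, but no tie
    or continuity correction is applied (a simplification for illustration).
    """
    # Pool the samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u = min(u_a, n1 * n2 - u_a)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p_value

# Hypothetical adjusted tone levels (dB) for the two sides of an axiom, A and B.
levels_a = [61.2, 63.0, 62.4, 60.8, 62.9]
levels_b = [62.1, 61.7, 63.3, 61.0, 62.6]
u_stat, p_value = mann_whitney_u(levels_a, levels_b)
print(f"U = {u_stat}, p = {p_value:.3f}")
```

A large p-value here means only that the data do not reject $A = B$; it does not by itself establish equality, which is close to the concern about test sensitivity raised elsewhere in this thread.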

I will try to get hold of this reference. First, I do not follow how this is "without assuming veridical interpretation of numbers". I'll await clarification of the point above. Second, are you able to describe the methods used? I find it difficult to believe there is any compelling evidence it is possible to get ratio measurements of perceived magnitudes, particularly just by "judging numbers". Thanks again. Holon 11:05, 24 June 2006 (UTC)

I hope I've addressed the first question in the above. By "describing the methods used", do you mean the estimation methods used in the published studies? In that case, with some variations, Ellermeier & Faulhammer, Zimmer, and Steingrimsson & Luce used magnitude productions in which respondents were presented with a number and instructed to adjust a tone to be the prescribed number times a reference tone. The reference was heard first, followed by the adjustable tone. Respondents adjusted the tone until they were satisfied with their judgment. Each judgment was made many times, interwoven with different stimuli and numerals, as needed.

Yes, I meant these methods, but also statistical or other methods to test the hypothesis that measurements of perceived magnitudes were obtained. See my question about tests above. There's no way you could get deterministic results except in trivial cases where there are large differences between magnitudes and the respondent expects some stimuli to be the same given the prompts in the experiment. I'll have to try to get hold of the reference. The whole approach is very different to the way we would go about it in psychometrics. We would not ask respondents to report numbers; rather only to judge relations of greater or less than.

I don't follow what you mean by "There's no way you could get deterministic results except in trivial cases". I suggest you read the Steingrimsson & Luce papers (2005a,b; 2006, in press), whose web location I gave above. They should give you an idea of previous work as well as the current work. The nice thing about what they have done is that it is a comprehensive study of all the issues you have raised in a single series of papers.

Thanks, I will check it out. The Rasch model has theoretical properties of measurement as required in physical sciences, and as per the representational theory. It is a stochastic measurement model. Psychological data are never deterministic except when the distances are very large, in which case the results are trivial (because relations of more than and less than are obvious). My PhD focuses on the unit of psychological measurement. Establishing a unit is necessary to measurement (in the classical theory, which is consistent with all of the natural sciences). Establishing a unit of psychological measurement is a very difficult task. In the Classical Theory, measurement is defined as the estimation of the ratio of a magnitude to a unit.

As the quote from Luce (1994) stresses, the issue of variability is one of on-going discussion that reaches deep into the foundations of science in general. What you write, "psychological data are never deterministic except when the distances are very large", is, on the face of it, nonsense unless you arbitrarily define error bounds such that the results fall outside of them. I could do the very same thing with any theory and show that it was futile to even measure the length of a table or a kilo of sugar. You must mean something I'm not getting. As to the question of "unit of psychological measurement" I really don't know what you mean. What is the unit of physical measurement? How is that answerable in absence of a scale? Or are you talking about some Fechnerian ideas such as JND's? I'm quite mystified by what you are saying here, but would like to understand.

I took a brief look at the first chapter of your thesis and from it I see that you are concerned with issues of maintaining the same scale across measurements. Indeed! I may read more. One thing caught my eye which relates directly to what we are talking about, namely invariances. Ratio invariance is what is needed to get ratio scales. That is testable up to the error of estimation and is at the heart of Narens' (1996) commutative property and Luce's (2002) proportion commutativity, and both are quite well sustained in audition.

We come to the same topic from very different perspectives, so I don't mean to dispute that the claims you state have been made. It is just a matter of discussing them from the different perspectives. I am enjoying the discussion in any case.

I too enjoy the discussion. We should be careful though not to confuse perspective with the issue unless you hold the philosophical view that there is no truth, only perspective. As far as I can tell, you come to the issue of measurement from the psychometric POV. In my latest comments, I've tried to express my awareness of the psychometric literature, especially as it relates to error in measurement, and because you seem to put much stock in Michell's views, I attempted to show that his views on the main discussion are not in any contradiction to the points I've made. In this context, I will close by pointing out that whether one assumes error is a nuisance or in need of being modelled is not a discussion that has an end at this juncture. Some axiomatic models have been formulated in a stochastic manner (e.g. Falmagne for testing double cancellation) but neither can rest on measurement foundations that differ, nor can that foundation differ from any other science.

By the way, I forgot to say -- you should write an article on axiomatic psychophysics if you are able. There was nothing on Rasch measurement before I added articles. Having articles depends on contributors like you who know the area. So I hope you will contribute material in this area if you can. Regards Holon 11:10, 24 June 2006 (UTC)

Yes, I would like to do that, but I fear it will be a big job...

I look forward to your reply.

(Rutuag 06:40, 25 June 2006 (UTC))

I understand, but keep in mind you can always start with a brief description of some essentials. The point of Wikipedia is that it is constantly being developed. Hopefully, if you make a start, at some point you can add or someone else will pick up on it and add. Holon 06:34, 26 June 2006 (UTC)

I'll see if I get inspired :). In the meantime, I intend to make just a few changes in the entry we are discussing in response to our conversation here...I don't know when, though...

(Rutuag 05:27, 27 June 2006 (UTC))

Further discussion on measurement

You said "there is a subtle but fundamental point to be made here: measurement with numbers is simply the assignment of numbers according to a rule". This seemed to be a fairly unambiguous statement, but as long as you appreciate the deficiencies of the definition, we have no disagreement.

I can't think of a better way to cast the problems of this than by the following: "The failure of measurement to "take" in cognition and psychometrics is related to a deep conceptual question concerning the relationship between statistics, as a way of describing randomness, and measurement, as a way of describing structure. The lack of an adequate theory for this relationship is, in reality, a weakness of both fields."

Rasch measurement models embody theoretical requirements for measurement in stochastic form. The Rasch model is not a statistical model; it is a probabilistic measurement model. I’m not sure which fields Luce refers to above – statistics and axiomatic measurement? If so, I agree.
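As a small illustration of what "measurement in stochastic form" can look like, the sketch below uses the standard dichotomous Rasch model, in which the probability of a positive response depends only on the difference between a person location and an item location (both in logits). The function names and the numerical values are hypothetical and chosen only for this example; nothing here is taken from the thesis or papers cited in this thread.

```python
import math

def rasch_probability(beta, delta):
    """P(positive response) for person location beta and item location delta."""
    return 1.0 / (1.0 + math.exp(-(beta - delta)))

def log_odds(p):
    """Natural-log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def item_contrast(beta, delta_1, delta_2):
    """Difference in log-odds of success on two items for the same person."""
    return (log_odds(rasch_probability(beta, delta_1))
            - log_odds(rasch_probability(beta, delta_2)))

# Hypothetical item locations; the contrast equals delta_2 - delta_1
# whatever person location is used, which is the invariance property.
for beta in (-2.0, 0.0, 3.0):
    print(beta, round(item_contrast(beta, 0.5, 1.5), 6))
```

The responses themselves remain probabilistic, but the comparison between the two items does not depend on which person is used to make it.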

I mean no offense, but this seems to be a strawman: where has there been any such confusion? Keeping this distinction in mind is utterly fundamental to all of my applied and pure research.

Precisely – I am very sceptical that anyone has achieved ratio measurement of perceptual phenomena, but I’m more than willing to explore claims to the contrary. Since you are suggesting – I would suggest Rasch is well worth a visit. Luce (1994) appears unaware of the links between Thurstone’s work and the Rasch model, and between the Rasch model and conjoint additivity? See http://www.rasch.org/memo24.htm on the latter. Surely he is not unaware of these links, since the BTL is a Rasch model (it has the properties that define Rasch models).

Michell (1999) writes, "...this prediction provides a specific test of the hypothesis that the attributes are quantitative...this test is called the Thomsen condition...a key condition in the theory of conjoint measurement...the important point is that a way, distinct from extensive measurement, had been specified whereby the hypothesis that an attribute has additive structure could be tested." (pp. 202-203). This is the same point as I attempted to make.

I have no disagreement. **update for clarification** The issue I have lies not with what is required for measurement, but what is required to demonstrate measurement has been successfully achieved. How it is tested. The main issue is that when data are stochastic, tests based on a deterministic framework are problematic. Above, I referred to deterministic cases as being trivial. It is hard to explain this point in depth. If you still want me to explain, I'll elaborate but I'll leave it for now.

As to the question of "unit of psychological measurement" I really don't know what you mean. What is the unit of physical measurement? How is that answerable in absence of a scale? Or are you talking about some Fechnerian ideas such as JND's? I'm quite mystified by what you are saying here, but would like to understand.

Units of physical measurement are defined in terms of precisely specified empirical conditions, which generally link to theory. For example, the definition of the metre in terms of the path of light traveling through a vacuum in a specific time interval, the kelvin in terms of a fraction of the triple point of water. My thesis is that units of psychological measurement must also be specified in terms of empirical conditions and other relevant quantitative attributes. I summarise in the conclusion of my thesis, linking to observations by Joel Michell. I believe this goes to the heart of Luce’s (1993, p. 127) stated goal “to prove qualitative theories underlying quantitative models that relate several variables”. In physics, units are only meaningfully defined in terms which relate more than one kind of quantity. On your question about scale – it depends on your definition of scale. I define scale as a continuum partitioned into units of fixed magnitude of the relevant quantitative attribute, phenomenon, or relation. That is, a scale is defined in terms of continuity and a unit. I’m not referring to JNDs, no, though this is a related matter in the way I have approached the study of the unit.

Precisely. Double cancellation is basically the same thing as invariance as articulated by Rasch, though Rasch articulated it in the context of an empirical frame of reference for measurement and expounded the concept in depth in terms of physical laws (Rasch, 1977).

On a more light-hearted note, let me illustrate what I mean about units of measurement with a fictional story. There is a captain of a ship, and he is about to go to war. He wants to take as much coal as possible with him, but his men must travel 100 miles to get the coal. So he asks a scientist to carefully measure how much mass the ship can take and then to also measure the coal. The scientist has the men load a large quantity of stone onto the ship. He has people judge many different quantities of stone against each other, and finds that his data meet the requirements of measurement. He then has someone travel and conduct similar experiments with the coal, and again he finds he can measure the coal. He reports back, saying that the measure of stone the ship can hold is 233, saying "we have also been able to measure the coal".

Captain: So then, how much coal can the ship hold?

Scientist: Sir, I don't know.

Captain: What do you mean you don't know? I asked you to measure how much mass the ship can hold and to measure the coal.

Scientist: Yes sir, we have done that.

Captain: Then what is the problem?

Scientist: Sir, there is no way of knowing what measurement of coal is equal to the measurement of stone the ship can hold.

The moral of the story is, I hope, obvious. This situation is simply ridiculous. Physical measurements are all but meaningless without knowledge of the unit. As Joel Michell says: "scientific measurement is properly defined as the estimation or discovery of the ratio of some magnitude of a quantitative attribute to a unit of the same attribute" (Michell, 1997, p. 358). See pp. 23-5 of my thesis for emphasis on how fundamental units of measurement are to all of the physical sciences.

I must be very clear, though: my view is that ignoring the importance of a unit has been a problem in measurement theory quite generally. Psychometrics is terrible in this respect as far as I'm concerned. Take care Holon 05:48, 28 June 2006 (UTC)

[edit] Question about CAT and dependent items

Hello (I saw you were discussing psychometrics on the IRT article). I'm currently doing my master's thesis, and it's related to psychometrics. I'm examining a new test that uses computer adaptive testing. I have, however, stumbled upon some issues with the way the test's algorithm works: it makes a factor analysis approach chaotic, and the approach I've used for estimating reliability, which I got from the test creators, is hard to understand. So, why do I put this here? Since you do psychometrics, maybe you know something about this issue and can point me in the direction of relevant literature, search terms, etc. I'm looking for articles that describe how you can do a factor analysis when you work with weighted items. That is, during the scoring of the test, each item's raw score is multiplied by an extra weight based on the item's overall importance, as decided by the scorer at the time of scoring. So items that are scored highly increase, while low-scored items are not improved as much. This weighting is hard to account for in a factor analysis. Do you know of any theory that discusses weighted items, group-dependent items, item-dependent groups, item dependency, etc.? This is a real challenge for me... I know the information I have given you is sketchy, but maybe you know of literature that touches on the area...

Thank you for your time. Thomasrm 14:43, 4 October 2006 (GMT1)

Could you send me an email using the wiki email function? I'm flat out right now but will get back to you as soon as possible. Holon 04:16, 5 October 2006 (UTC)

[edit] Rubrics for assessment

Hi Holon, some time ago you posted your comments & suggestions on assessment, in particular the development of more rigorous rubrics and carefully designed examples. I didn't forget this suggestion, I was simply too busy to work on that issue at the time. We now have over 10% (and growing!) of the English Wikipedia assessed via this scheme, so it's getting to be pretty important! I also think we'll start seeing press stories mentioning it, so I'd like the system to be able to stand up to public scrutiny, even if it is a fairly rough scheme. Things on WP should lighten up for me in a week or two. Would you be willing to work on this with me? Please let me know, thanks again for sharing your ideas, Walkerma 05:35, 9 November 2006 (UTC)

Walkerma, very happy to work on it with you. To look at it in a couple of weeks or so would be better for me also. I had a brief look at the rubric and examples. The fact there are examples of each classification is good. In the general scheme of things, this is a pretty good effort at a rubric if reasonably coarse classification is all that is needed. Perhaps, when you have time, you could summarise any contentious issues. I'll try to read through the discussion though. To allow the clearest possible classifications, scaling examples with a series of comparisons of pairs of articles would really help, as I said. Whether this is worthwhile depends on the purpose and the 'stakes'. But yeah, not a problem even if you just want to think through whether it is already good, or whether a little tweaking would help. Feel free to email me using the wiki function. Cheers -- Holon 11:40, 9 November 2006 (UTC)

Great! There haven't been any contentious issues with the quality, but I think a well thought-out set of examples would be good. The "stakes" are quite high just because soon most of the English Wikipedia will be referenced (for quality assessment) to these few examples. (The importance criterion has been much more contentious, because some people get very upset when told their favourite topic is unimportant!) I'll contact you in a fortnight or so. Thanks, Walkerma 07:21, 10 November 2006 (UTC)
