Fluency - Self Corrections Feature
- 1 Overview
- 2 Self Corrections JSON
- 2.1 JSON Breakdown
- 3 3 use cases for self-corrections
- 3.1 1. Repetition
- 3.2 Examples:
- 3.2.1 Higher confidence repeated word:
- 3.2.1.1 JSON excerpt:
- 3.2.2 Shorter duration repeated word:
- 3.2.2.1 JSON excerpt:
- 3.3 2. Phonetically similar words
- 3.4 Example:
- 3.4.1 JSON excerpt:
- 3.5 3. Incorrect tense
- 3.6 Example:
- 3.6.1 JSON excerpt:
- 4 Calculations
Overview
Self-corrections are a key metric scored in running records and observed oral reading fluency (ORF) assessments.
Definitions of “self-corrections” vary. For example, for DIBELS, a widely-adopted system for assessing students’ literacy skills, a self-correction is when “a student makes an error but corrects it within three seconds.”
SoapBox definition:
A self-correction is an immediate correction of an incorrectly read word (the incorrectly read word is known as the “reparandum”).
Version 1 Note:
V1 self-corrections are ‘one-word corrections’, i.e. the correction is directly adjacent to the error, and data is returned for one word only.
There are three criteria for self-corrections:
Repetition
Phonetically similar words
Incorrect tense
Self Corrections JSON
The following keys appear in the JSON to indicate that a potential self-correction was detected.
…
"sub_types": {
    "self_correction": {
        "reparandums": [2]
    }
},
"transcription_details": {
    "time_since_previous": 0.54
},
"type": "CORRECT"
…
Key | Description |
---|---|
time_since_previous | Indicates how far (in seconds) a word is from the previous one |
self_correction | Present under sub_types when a potential self-correction was detected |
reparandums | Under the self_correction key; an array of indexes pointing to the word(s) being corrected |
JSON Breakdown
The following is an example of the JSON structure you can expect from Fluency. In the example below,
the audio file contains “i like tripe stripes” and the reference text contains “i like stripes”.
Reference Text | i like stripes |
Child says | “i like tripe stripes” |
So, the student attempts the word stripes, says the very similar word tripe, and then self-corrects to stripes.
The following snippet omits some details for ease of reading. It shows the text_score object for tripe and the text_score object for stripes. tripe has type INSERTION and stripes has type CORRECT, but under sub_types for stripes there is a self_correction object
whose reparandums array points to the text_score index of tripe (which is 2).
{
    "end": 9.69,
    "start": 9.06,
    "normalised_word": "",
    "reference_index": 1,
    "reference_word": "",
    "sub_types": {},
    "transcription_details": {
        "time_since_previous": 0.11
    },
    "transcription_index": 2,
    "transcription_word": "tripe",
    "type": "INSERTION"
},
{
    "end": 10.95,
    "start": 10.23,
    "normalised_word": "stripes",
    "reference_index": 2,
    "reference_word": "stripes",
    "sub_types": {
        "self_correction": {
            "reparandums": [2]
        }
    },
    "transcription_details": {
        "time_since_previous": 0.54
    },
    "transcription_index": 3,
    "transcription_word": "stripes",
    "type": "CORRECT"
}
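Consumers can resolve a self_correction back to the word it corrects by indexing into the text_score array. A minimal sketch, assuming the reparandums values match each word's transcription_index as in the excerpt above (the trimmed text_score list is an illustrative stand-in, not a full response):

```python
# Resolve self-corrections back to their reparandum words.
# The trimmed text_score list below is an illustrative stand-in
# for a full Fluency response.
text_score = [
    {"transcription_index": 2, "transcription_word": "tripe",
     "type": "INSERTION", "sub_types": {}},
    {"transcription_index": 3, "transcription_word": "stripes",
     "type": "CORRECT",
     "sub_types": {"self_correction": {"reparandums": [2]}}},
]

# Index entries by transcription_index, which is what the
# reparandums arrays reference in the excerpts.
by_index = {w["transcription_index"]: w for w in text_score}

corrections = []  # (corrected word, reparandum word) pairs
for word in text_score:
    sc = word.get("sub_types", {}).get("self_correction")
    if sc:
        for idx in sc["reparandums"]:
            corrections.append(
                (word["transcription_word"],
                 by_index[idx]["transcription_word"])
            )
# corrections is now [("stripes", "tripe")]
```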
3 use cases for self-corrections
Our self-corrections feature supports three key use cases. They have been robustly validated with customers to ensure they address the majority of commonly made self-corrections in ORF.
1. Repetition
This use case improves our current repetition feature to account for self-corrections.
Our voice engine now examines a word's confidence score and duration to differentiate a self-correction from an insertion. It looks for a pattern where a word is said with low confidence or long duration and is immediately followed by a higher-confidence or shorter-duration repetition of the word.
If the repeated word has 10% or higher confidence than the first attempt, the second word is considered a self-correction of the first.
If the repeated word has a 30%+ shorter duration than the first attempt, the second word is considered a self-correction of the first, even if the confidence is similar for both words.
This use case accounts for situations where a child takes a long time to utter a word, for example elongating its letters while trying to decode (sound out) the word without fully stopping speaking, and then says the word again more fluently.
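The two rules above can be sketched as a small predicate. This is an illustration of the logic, not the engine's internal implementation; the field names mirror the JSON excerpts, and the 10% and 30% thresholds come straight from the rules:

```python
def is_repetition_self_correction(first, second):
    """Illustrative check: does `second` look like a self-correction
    (fluent repetition) of `first`? Not the engine's internal code."""
    if first["transcription_word"] != second["transcription_word"]:
        return False  # only repetitions of the same word qualify
    # Rule 1: the repeat is said with 10% or higher confidence.
    if second["confidence"] - first["confidence"] >= 10:
        return True
    # Rule 2: the repeat is 30%+ shorter, even at similar confidence.
    first_duration = first["end"] - first["start"]
    second_duration = second["end"] - second["start"]
    return second_duration <= 0.7 * first_duration
```

Applied to the zig excerpt in the example below, the confidence jump from roughly 17 to 94 satisfies the first rule.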
Examples:
Higher confidence repeated word:
A student is reading a passage. When they get to the word zig, they decode it and have poor articulation on the final part of the word. The student then repeats “zig” correctly.
JSON excerpt:
…
{
    "normalised_word": "",
    "reference_index": 16,
    "reference_word": "",
    "sub_types": {},
    "transcription_details": {
        "confidence": 17.302678346634,
        "end": 19.89046875,
        "phone_breakdown": [],
        "start": 19.41,
        "time_since_previous": 0.03
    },
    "transcription_index": 16,
    "transcription_word": "zig",
    "type": "INSERTION"
},
{
    "normalised_word": "zig",
    "reference_index": 16,
    "reference_word": "zig",
    "sub_types": {
        "self_correction": {
            "reparandums": [16]
        }
    },
    "transcription_details": {
        "confidence": 94.16164431,
        "end": 20.73,
        "phone_breakdown": [],
        "start": 19.89046875,
        "time_since_previous": 0
    },
    "transcription_index": 17,
    "transcription_word": "zig",
    "type": "CORRECT"
},
…
Shorter duration repeated word:
The student continues reading. When they get to the word Phonzy, they slowly sound it out. Then they repeat “Phonzy” faster and more fluently.
JSON excerpt:
…
{
    "normalised_word": "",
    "reference_index": 28,
    "reference_word": "",
    "sub_types": {},
    "transcription_details": {
        "confidence": 92.1654463,
        "end": 33.46,
        "phone_breakdown": [],
        "start": 32.53,
        "time_since_previous": 0.06
    },
    "transcription_index": 30,
    "transcription_word": "phonzy",
    "type": "INSERTION"
},
{
    "normalised_word": "phonzy",
    "reference_index": 28,
    "reference_word": "phonzy",
    "sub_types": {
        "self_correction": {
            "reparandums": [30]
        }
    },
    "transcription_details": {
        "confidence": 100,
        "end": 33.91,
        "phone_breakdown": [],
        "start": 33.46,
        "time_since_previous": 0.0
    },
    "transcription_index": 31,
    "transcription_word": "phonzy",
    "type": "CORRECT"
}
…
2. Phonetically similar words
For this use case, the SoapBox voice engine looks for a pattern where a student reads a word incorrectly, but the incorrect word and correct word are phonetically similar, and the student self-corrects to the correct word.
Example:
In a passage, a student reads the word shed as “she.” They then self-correct to “shed.”
JSON excerpt:
…
{
    "normalised_word": "",
    "reference_index": 7,
    "reference_word": "",
    "sub_types": {},
    "transcription_details": {
        "confidence": 97.302678346634,
        "end": 9.891,
        "phone_breakdown": [],
        "start": 9.41,
        "time_since_previous": 0.03
    },
    "transcription_index": 7,
    "transcription_word": "she",
    "type": "INSERTION"
},
{
    "normalised_word": "shed",
    "reference_index": 7,
    "reference_word": "shed",
    "sub_types": {
        "self_correction": {
            "reparandums": [7]
        }
    },
    "transcription_details": {
        "confidence": 100,
        "end": 10.73,
        "phone_breakdown": [],
        "start": 9.891,
        "time_since_previous": 0
    },
    "transcription_index": 8,
    "transcription_word": "shed",
    "type": "CORRECT"
},
…
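The engine's actual phonetic comparison operates on the words' sounds and is not exposed in the API. As a rough illustration only, an orthographic string-similarity check (Python's difflib, chosen as a stand-in for this sketch; the 0.75 threshold is likewise illustrative) captures the she/shed pattern:

```python
import difflib

def looks_similar(word_a, word_b, threshold=0.75):
    """Rough orthographic stand-in for the engine's phonetic
    comparison; the 0.75 threshold is illustrative only."""
    ratio = difflib.SequenceMatcher(None, word_a, word_b).ratio()
    return ratio >= threshold

# "she" -> "shed" is close enough to look like a self-correction
# candidate; an unrelated word is not.
```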
3. Incorrect tense
Our voice engine can also flag a self-correction when the child says the wrong tense of a verb (e.g., “walk” vs. “walked” vs. “walking”).
Example:
In a passage, a student reads the word bounding instead of “bounded.” They then self-correct to “bounded.”
JSON excerpt:
…
{
    "normalised_word": "",
    "reference_index": 11,
    "reference_word": "",
    "sub_types": {},
    "transcription_details": {
        "confidence": 97.741717100143,
        "end": 4.97734771,
        "phone_breakdown": [],
        "start": 4.68,
        "time_since_previous": 0
    },
    "transcription_index": 11,
    "transcription_word": "bounding",
    "type": "INSERTION"
},
{
    "normalised_word": "bounded",
    "reference_index": 11,
    "reference_word": "bounded",
    "sub_types": {
        "self_correction": {
            "reparandums": [11]
        }
    },
    "transcription_details": {
        "confidence": 100,
        "end": 5.04,
        "phone_breakdown": [],
        "start": 4.97734771,
        "time_since_previous": 0
    },
    "transcription_index": 12,
    "transcription_word": "bounded",
    "type": "CORRECT"
},
…
Calculations
A user can use self-corrections to improve the accuracy of their calculations. If they wish to ignore reparandums as mistakes and not penalize a student for them, they can calculate the number of reparandums (reparandum_count) by searching the text_score array.
With this reparandum_count they can get the true_insertion_count, which is the following:
true_insertion_count = insertion_count - reparandum_count
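This count can be derived in a single pass over text_score. A minimal sketch (the trimmed list is an illustrative stand-in for a full response):

```python
# Count reparandums in a text_score array and derive the adjusted
# insertion count. The trimmed list is an illustrative stand-in
# for a full Fluency response.
text_score = [
    {"type": "INSERTION", "sub_types": {}},
    {"type": "CORRECT",
     "sub_types": {"self_correction": {"reparandums": [2]}}},
    {"type": "CORRECT", "sub_types": {}},
]

insertion_count = sum(w["type"] == "INSERTION" for w in text_score)

# Every index listed in a reparandums array counts as one reparandum.
reparandum_count = sum(
    len(w["sub_types"].get("self_correction", {}).get("reparandums", []))
    for w in text_score
)

true_insertion_count = insertion_count - reparandum_count
# insertion_count=1, reparandum_count=1, true_insertion_count=0
```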
More detailed information can be found here:
Fluency - Example Calculations