TEXTCRUNCHER XTRA HELP: METHODS DOCUMENTATION
TC_Register(registrationCodeString) - No return. The demo version of TextCruncher Xtra displays a trial-version alert the first time you use a method other than TC_Register. If you have purchased the TextCruncher Xtra, you received a registration number. Pass that number to TC_Register as a string:
    TC_Register("48dkd2929") -- your registration number is the string inside the parentheses
Example: TC_Register("4812345")
SEARCH COMMANDS
FindFirst(sourceString,findString) - where sourceString is the string to search in and findString is the string to look for. Returns the integer character position in sourceString of the first occurrence of findString, or 0 if the string was not found or if there was an error. The returned position is that of the first character of the match. Not affected by SetPosition: FindFirst always searches from character 1 of sourceString. After FindFirst the current position is the return value of FindFirst.
Example:
    put findFirst("abcdefg","c")
    -- 3
FindNext(sourceString,findString) - where sourceString is the string to search in and findString is the string to look for. Returns the integer character position in sourceString of the first occurrence of findString after the current search position, or 0 if the string was not found or if there was an error. The returned position is that of the first character of the match. The current position may have been set by a previous Find or Replace or by the SetPosition command. After FindNext the current position is the return value of FindNext. If there is no occurrence after the current position, FindNext returns 0, which sets the current position to 0 and therefore causes any subsequent FindNext to start at the beginning again.
Example:
    set source = "When you select a category, the filters in that category appear in a list."
    put FindFirst(source,"category")
    -- 19
    put FindNext(source,"category")
    -- 49
    setPosition(40)
    put FindNext(source,"category")
    -- 49
FindPrevious(sourceString,findString) - where sourceString is the string to search in and findString is the string to look for. Returns the integer character position in sourceString of the first occurrence of findString before the current search position, or 0 if the string was not found or if there was an error. The returned position is that of the first character of the match. The current position may have been set by a previous Find or Replace or by the SetPosition command. After FindPrevious the current position is the return value of FindPrevious. If there is no occurrence before the current position, FindPrevious returns 0, which sets the current position to 0.
Example:
    set source = "When you select a category, the filters in that category appear in a list."
    setPosition(40)
    put FindPrevious(source,"category")
    -- 19
FindAll(sourceString,findString) - where sourceString is the string to search in and findString is the string to look for. Returns a list containing the starting character positions of all occurrences of findString in sourceString, or an empty list if the string was not found or if there was an error. Not affected by SetPosition: FindAll always searches from character 1 of sourceString. Does not affect the current position.
Example:
    set source = "When you select a category, the filters in that category appear in a list."
    put FindAll(source,"category")
    -- [19, 49]
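The position rules described for FindFirst and FindNext suggest one way to visit each match in turn. The handler below is a minimal sketch, assuming the two commands behave as documented above (in particular, that FindNext returns 0 once no later match exists). The handler name listOccurrences is hypothetical and is not part of TextCruncher.

```lingo
on listOccurrences source, target
  -- Hypothetical helper: collects match positions one at a time.
  -- FindFirst always starts at character 1 and sets the position.
  set found = []
  set pos = FindFirst(source, target)
  repeat while pos > 0
    append found, pos
    -- FindNext searches after the current position and
    -- returns 0 when there are no more matches.
    set pos = FindNext(source, target)
  end repeat
  return found
end
```

FindAll should give the same kind of result in a single call; a loop like this is mainly of interest when some other work has to happen between matches. Keep in mind that any other TextCruncher call that changes the position inside the loop would disturb the walk.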
CHARACTER POSITION
The return values of the Find commands would not be very useful unless you had some way of fetching the found word back out of the Director text chunk you were searching. These commands, in tandem with the Find commands, can be used to do things like highlight the found word in a Director field or display the found word in context with the other words on the same line.
GetWordOfCharPosition(sourceString,characterNumber) - where sourceString is the string to search in and characterNumber is the integer character position in the string. Returns the integer word number in sourceString of the word containing the specified character position, or 0 if no word contains the character position or if there was an error. Converts the character position returned by one of the Finds to a word number in the text chunk.
Example:
    on showWord fieldName,theWord
      -- Finds a word in a field and highlights it
      --
      set charPos = FindNext(field fieldName,theWord)
      if TC_GetLastError() = 0 then
        set wordNum = GetWordOfCharPosition(field fieldName,charPos)
        if wordNum > 0 then
          hilite word wordNum of field fieldName
        end if
      end if
    end
GetLineOfCharPosition(sourceString,characterNumber) - where sourceString is the string to search in and characterNumber is the integer character position in the string. Returns the integer line number in sourceString of the line containing the specified character position, or 0 if no line contains the character position or if there was an error. Returns 0 if the character in the specified position is a line ending character - RETURN (13), LINEFEED (10). Converts the character position returned by one of the Finds to a line number in the text chunk.
Example:
    on showContext fieldName,theWord,contextFieldName
      -- Finds the line in a field a word appears in
      -- and displays the whole line of text in another
      -- field
      --
      set charPos = FindNext(field fieldName,theWord)
      if TC_GetLastError() = 0 then
        set lineNum = GetLineOfCharPosition(field fieldName,charPos)
        if lineNum > 0 then
          put line lineNum of field fieldName into field contextFieldName
        end if
      end if
    end
GetItemOfCharPosition(sourceString,characterNumber,itemDelimiterASCIICode) - where sourceString is the string to search in, characterNumber is the integer character position in the string and itemDelimiterASCIICode is the ASCII value (charToNum) of the character to use for the item delimiter. Returns the integer item number in sourceString of the item containing the specified character position, or 0 if no item contains the character position or if there was an error. Converts the character position returned by one of the Finds to an item number in the text chunk.
Example:
    on getDataFieldContainingWord dataRecord,theWord
      -- Find word in tab-delimited data record and
      -- return data field containing word
      --
      set charPos = FindFirst(dataRecord,theWord)
      if TC_GetLastError() = 0 then
        set dataFieldNum = GetItemOfCharPosition(dataRecord,charPos,9)
        if dataFieldNum > 0 then
          set the itemDelimiter = numToChar(9)
          return item dataFieldNum of dataRecord
        else
          return ""
        end if
      else
        return TC_ErrorCodeToString(TC_GetLastError())
      end if
    end
REPLACE
The following commands find a specified character or string in text and replace it with another. The original string remains unchanged. A copy of the string with the changes is returned. Note the argument order: the replacement string comes before the string to look for. You can perform multiple replaces on the same text by passing the return from one replace to the next like so:
    set source = "The SRP for MS Word is $299.00"
    set source = ReplaceAll(source, "suggested retail price","SRP")
    set source = ReplaceAll(source, "Microsoft","MS")
    put source
    -- "The suggested retail price for Microsoft Word is $299.00"
ReplaceFirst(sourceString,replaceString,findString) - where sourceString is the string to search in, replaceString is the string to replace the found string with and findString is the string to look for. Returns a copy of the source string modified by the replace operation, or "" if there was an error. Finds the first occurrence of findString in the source string and replaces it with replaceString. Not affected by SetPosition: ReplaceFirst always searches from character 1 of sourceString. After ReplaceFirst the current position is the character position after the end of the replaced string. This command ignores the current case-sensitivity setting set by SetCaseSensitivity for high-ASCII characters (above numToChar(127)). For high-ASCII characters it is always case-sensitive. See (deprecated in D11) for more information on case sensitivity.
Example 1:
    set source = "aaabbb"
    put replacefirst(source,"xxx","aaa")
    -- "xxxbbb"
    put getposition()
    -- 4
Example 2:
    on tabsToSpaces memName
      -- Replace TAB indents at the beginnings of lines
      -- with 5 spaces instead, in a text member
      --
      set temp = the text of member memName
      set li = the number of lines of temp
      repeat with x = 1 to li
        set thisLine = line x of temp
        if char 1 of thisLine = TAB then
          set thisLine = ReplaceFirst(thisLine,"     ",TAB)
          put thisLine into line x of temp
        end if
      end repeat
      set the text of member memName = temp
    end
ReplaceNext(sourceString,replaceString,findString) - where sourceString is the string to search in, replaceString is the string to replace the found string with and findString is the string to look for. Returns a copy of the source string modified by the replace operation, or "" if there was an error. Finds the first occurrence of findString in sourceString on or after the current position and replaces it with replaceString. The current position may have been set by a previous Find or Replace or by the SetPosition command. After ReplaceNext the current position is the character position after the end of the replaced string. If ReplaceNext goes past the end of the source string it will return 0 and set position back to 0, which causes a subsequent ReplaceNext to wrap around and start at the beginning of the string. This command ignores the current case-sensitivity setting set by SetCaseSensitivity for high-ASCII characters (above numToChar(127)). For high-ASCII characters it is always case-sensitive. See (deprecated in D11) for more information on case sensitivity.
Example:
    set source = "aaa cat bbb cat ccc cat "
    set source = ReplaceFirst(source,"dog","cat")
    put source
    -- "aaa dog bbb cat ccc cat "
    set source = ReplaceNext(source,"dog","cat")
    put source
    -- "aaa dog bbb dog ccc cat "
    put getposition()
    -- 16
ReplaceAll(sourceString,replaceString,findString) - where sourceString is the string to search in, replaceString is the string to replace the found string with and findString is the string to look for. Returns a copy of the source string modified by the replace operation, or "" if there was an error. Finds all occurrences of findString in sourceString and replaces each occurrence with replaceString. Not affected by SetPosition: ReplaceAll always searches from character 1 of sourceString. Does not affect the current position.
Example 1:
    set source = "aaa cat bbb cat ccc cat "
    set source = ReplaceAll(source,"dog","cat")
    put source
    -- "aaa dog bbb dog ccc dog "
Example 2:
    on fixLineEndings convertToWhichType
      -- Converts Mac format text file to PC or
      -- PC to Mac. Utility handler. No error
      -- checking built in.
      --
      -- EX: fixLineEndings("PC")
      -- Converts Mac file to PC format. Prompts
      -- for file to convert.
      --
      set y = new(xtra "fileio")
      set theFilePath = displayOpen(y)
      if theFilePath <> "" then
        openFile(y,theFilePath,1)
        set temp = readFile(y)
        delete(y)
        closeFile(y)
        set PCending = RETURN & numToChar(10)
        set MacEnding = RETURN
        if convertToWhichType = "PC" then
          set temp = ReplaceAll(temp,PCending,MacEnding)
        else
          set temp = ReplaceAll(temp,MacEnding,PCending)
        end if
        createFile(y,theFilePath)
        openFile(y,theFilePath,2)
        writeString(y,temp)
        closeFile(y)
      end if
    end
SEARCH/REPLACE PROPERTIES
The following properties affect the operation of the Find and Replace commands.
SetPosition(characterNumber) - where characterNumber is the integer character number to reposition search start to. No return. JavaScript note: This command requires a small Lingo script to work in JavaScript. Position is the current character position that TextCruncher has been set to. You use SetPosition to set the current character position manually.
The following TextCruncher commands also affect the current position:
    FindFirst
    FindNext
    FindPrevious
    ReplaceFirst
    ReplaceNext
Some TextCruncher commands use the current position as their starting point:
    FindNext starts searching from the character after the current position.
    FindPrevious starts searching from the character before the current position.
    ReplaceNext starts searching from and including the current position.
TextCruncher does not associate position with any particular string you are searching. You should either use FindFirst or ReplaceFirst to find the first occurrence in a new string or reset the position to 0 manually between operations on different strings to avoid starting a search in the middle of the string instead of at the beginning.
Example 1:
    -- Position affects commands differently
    set source = "aaabbbccc"
    setPosition(4)
    put FindNext(source,"bbb")
    -- 0
    setPosition(4)
    put FindPrevious(source,"bbb")
    -- 0
    setPosition(4)
    put ReplaceNext(source,"xxx","bbb")
    -- "aaaxxxccc"
Example 2:
    -- One operation can set position
    -- so that another operation will
    -- not start at the beginning of
    -- the next string
    --
    set source = "aaabbbccc"
    put FindFirst(source,"bbb")
    -- 4
    -- Position is now set to 4
    set source = "Do not pass go."
    put FindNext(source,"Do")
    -- 0
    -- The operation started at position 4,
    -- so it missed the word at pos 1.
    -- Should have used FindFirst on the
    -- new string.
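Because the position is shared across all strings, a loop that searches several strings in a row may want to reset it explicitly each time. The following is a rough sketch only, assuming SetPosition and FindNext behave as described above; the handler name findInEach and its parameters are hypothetical and not part of TextCruncher.

```lingo
on findInEach stringList, target
  -- Hypothetical helper: for each string in stringList, records the
  -- position of the first match of target (0 if there is none).
  set results = []
  repeat with oneString in stringList
    -- Reset the shared position so the search does not begin
    -- in the middle of the new string.
    setPosition(0)
    append results, FindNext(oneString, target)
  end repeat
  return results
end
```

Calling FindFirst inside the loop would serve the same purpose; the explicit reset is shown here because it also applies when the first operation on a new string is a ReplaceNext or FindPrevious.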
GetPosition( ) - Returns the current character position, which is the integer character number that position-dependent searches start from. JavaScript note: This command requires a small Lingo script to work in JavaScript.
Example:
    set source = "Do not pass go."
    put FindFirst(source,"not")
    -- 4
    put GetPosition()
    -- 4
SetCaseSensitivity(onOrOff) - where onOrOff is the boolean value, either 1 (TRUE) to consider case or 0 (FALSE) to ignore case. No return. Determines whether or not the Find and Replace commands will consider case. The default is FALSE - ignore case.
Example:
    SetCaseSensitivity(FALSE)
    set source = "Cart the cart over here."
    put FindFirst(source,"cart")
    -- 1
    SetCaseSensitivity(TRUE)
    put FindFirst(source,"cart")
    -- 10
INDEXING
Use the list commands to index Director text chunks for operations where you otherwise would have to refer back to the text itself. For instance you can use GetListOfWords to speed up a proximity search (word 1 within so many words of word 2). List commands like:
    if getAt(wordList,5) = secondword
are faster than text chunk commands like:
    if word 5 of field "whatever" = secondword
GetListOfWords(sourceString) - where sourceString is the string to operate on. Returns a list of the character chunks in the string delimited by white space, or an empty list if there was an error. White space includes TAB (9), Linefeed (10) and Return (13).
Example:
    on makeIndex thestring
      -- Create an alphabetical index of the words
      -- in a text chunk
      --
      set index = [:]
      set thestring = ToLowerCase(thestring)
      set wordList = GetListOfWords(thestring)
      if TC_GetLastError() = 0 then
        set numWords = count(wordList)
        repeat with w = 1 to numWords
          set thisword = getAt(wordList,w)
          if voidP(getaprop(index,thisword)) then
            addprop index,thisword,list(w)
          else
            append getprop(index,thisword),w
          end if
        end repeat
        sort index
      end if
      return index
    end
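The proximity search mentioned in the introduction to this section could be sketched along the following lines. This is an illustration under stated assumptions rather than a tested recipe: the handler name wordsAreNear and its parameters are hypothetical, and only GetListOfWords and TC_GetLastError are TextCruncher calls.

```lingo
on wordsAreNear source, firstWord, secondWord, maxDistance
  -- Hypothetical helper: TRUE if secondWord appears within
  -- maxDistance words of any occurrence of firstWord.
  set wordList = GetListOfWords(source)
  if TC_GetLastError() <> 0 then return FALSE
  set numWords = count(wordList)
  repeat with i = 1 to numWords
    if getAt(wordList, i) = firstWord then
      -- Clamp the window to the bounds of the word list
      set lo = max(1, i - maxDistance)
      set hi = min(numWords, i + maxDistance)
      repeat with j = lo to hi
        if j <> i then
          if getAt(wordList, j) = secondWord then return TRUE
        end if
      end repeat
    end if
  end repeat
  return FALSE
end
```

Since words are split on white space only, any punctuation attached to a word (for example "category,") stays part of that list entry, so the comparison would miss it unless the punctuation is removed first, perhaps with ReplaceAll.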
GetListOfLines(sourceString) - where sourceString is the string to operate on. Returns a list of the lines contained in sourceString, delimited by either CR only (Mac) or CR/LF (PC), or an empty list if there was an error.
Example:
    on readPCFile filePath
      -- Automatically strips off linefeeds and puts lines into
      -- a list
      --
      set f = new(xtra "fileio")
      openFile(f,filePath,1)
      set filetext = readFile(f)
      closeFile(f)
      set lineList = GetListOfLines(filetext)
      return lineList
    end
GetListOfItems(sourceString,itemDelimiterASCIICode) - where sourceString is the string to operate on, itemDelimiterASCIICode is the ASCII value (charToNum) of character to use for the item delimiter. Returns a list of items delimited by the itemDelimiterASCIICode specified or empty list if there was an error. Divides a string into a list of pieces delimited by any character. Get the ASCII code needed by using chartonum() on any typeable character in the message window or refer to the ASCII chart. Note: A string that begins with the specified item delimiter will create an empty item at the beginning of the list. The old TextCruncher did not create an empty item. This change was made to keep TC consistent with Director's item handling and also to preserve information that would otherwise be lost. For instance, if the string was a data record, the empty item would signify an empty data field and should be retained.
Example 1:
    set phonenum = "891-752-3344"
    put chartonum("-")
    -- 45
    set phonepieces = GetListOfItems(phonenum,45)
    put phonepieces
    -- ["891", "752", "3344"]
    set areacode = getAt(phonepieces,1)
    put areacode
    -- "891"
Example 2:
    on readInDataFile theFilePath, keyFieldNumber
      -- Reads in tab-delimited data file and converts it
      -- to a list indexed on the specified key field
      -- Code assumes that record fields are delimited by
      -- tabs, records are delimited by either CR or CR/LF,
      -- and that the data fields themselves do not contain
      -- CR or LF characters.
      --
      -- Created list looks like this:
      --
      -- ["IN000789":["Vargas","Paul","PART987","Ballpeen hammer","5.00"],
      --  "IN000790":["Jones","Marty","PART002","Switchplate","1.97"] ]
      --
      set f = new(xtra "fileio")
      openFile(f,theFilePath,1)
      set source = readFile(f)
      closeFile(f)
      set indexedList = [:]
      set lineList = GetListOfLines(source)
      if TC_GetLastError() = 0 then
        repeat with oneLine in lineList
          if oneLine = "" then next repeat
          set fieldList = GetListOfItems(oneLine,9)
          set key = getAt(fieldList,keyFieldNumber)
          if not voidP(getaprop(indexedList,key)) then
            alert "Duplicate key: " & key
          else
            deleteAt fieldList,keyFieldNumber
            addProp indexedList,key,duplicate(fieldList)
          end if
        end repeat
      end if
      return indexedList
    end
CASE
These operations use the Mac Standard Roman character set on Mac and PC ANSI on PC. Some decorative or shareware fonts put non-standard characters in empty or little-used character positions, which can cause unexpected results when characters above 127 are cased. Check the character set crossmap against the font before reporting a casing bug. These methods return a modified copy of the source string, leaving the original string unchanged. See (deprecated in D11) for more information on character encoding changes.
ToUpperCase(sourceString) - where sourceString is the string to operate on. Returns a copy of the source string uppercased, or "" if there was an error. Replaces any character in the string with its uppercase counterpart if there is one. Leaves the character unchanged if there is no uppercase character for it. Uses Mac Standard Roman character set on Mac and PC ANSI on PC. This method is deprecated in D11.
Example:
    set source = "uppercase this!"
    put ToUpperCase(source)
    -- "UPPERCASE THIS!"
ToLowerCase(sourceString) - where sourceString is the string to operate on. Returns a copy of the source string lowercased, or "" if there was an error. Replaces any character in the string with its lowercase counterpart if there is one. Leaves the character unchanged if there is no lowercase character for it. This method is deprecated in D11.
Example:
    set source = "pUT $50,000 iN UnMARked BIlLs IN tHe BRieFcaSE."
    set source = ToLowerCase(source)
    put ToUpperCase(char 1 of source) into char 1 of source
    put source
    -- "Put $50,000 in unmarked bills in the briefcase."
FORMATTING
The following commands break text up into lines of a specified length and align the text on each line to the left, center or right margin. They are most useful for formatting plain text that will display in a monospaced font such as Courier. Lines created by wrapping are delimited with the Mac line endings (CR only). A single word in the text that is longer than the wrap length will not be broken, and will therefore cause that line to be longer than the wrap length. These methods return a modified copy of the source string, leaving the original string unchanged.
HardWrapText(sourceString,charsPerLine) - where sourceString is the string to operate on and charsPerLine is the maximum number of characters per line. Returns a copy of the source string reformatted, or "" if there was an error. Aligns text to the left margin and breaks any line longer than the specified length limit with RETURN.
Example:
    set ruler = RETURN & "123456789012345678901234567890" & RETURN
    set source = "The quick brown fox jumped over the lazy dog."
    set wrapped = HardWrapText(source,20)
    put ruler & wrapped
    -- "
    123456789012345678901234567890
    The quick brown fox
    jumped over the lazy
    dog."
HardCenterText(sourceString,charsPerLine) - where sourceString is the string to operate on and charsPerLine is the maximum number of characters per line. Returns a copy of the source string reformatted, or "" if there was an error. Breaks any line longer than the specified length limit with RETURN. Centers text shorter than the specified line length by padding it on the left with spaces.
Example:
    set ruler = RETURN & "123456789012345678901234567890" & RETURN
    set source = "The quick brown fox jumped over the lazy dog."
    set wrapped = HardCenterText(source,30)
    put ruler & wrapped
    -- "
    123456789012345678901234567890
      The quick brown fox jumped
          over the lazy dog."
HardAlignTextRight(sourceString,charsPerLine) - where sourceString is the string to operate on and charsPerLine is the maximum number of characters per line. Returns a copy of the source string reformatted, or "" if there was an error. Breaks any line longer than the specified length limit with RETURN. Aligns text shorter than the specified line length to the right margin by padding it on the left with spaces. Not currently supported under Mac OSX.
Example:
    set ruler = RETURN & "123456789012345678901234567890" & RETURN
    set source = "The quick brown fox jumped over the lazy dog."
    set wrapped = HardAlignTextRight(source,30)
    put ruler & wrapped
    -- "
URL ENCODING
The following commands encode/decode punctuation characters in the lower ASCII range (< 128) and all characters in the high ASCII range, to make text that can be transmitted over the internet. These methods return a modified copy of the source string, leaving the original string unchanged. The encoding conforms to RFC1738 for low ASCII characters. For characters 128 and above, which are not covered by RFC1738, the hex value of the corresponding character in the ISO 8859-1 Latin-1 character set is used.
Huh?
The content of a URL cannot contain characters that have special meaning within the URL like ":" or "/" or the browser will interpret the URL incorrectly. The convention for including such characters in a URL is to represent them with a percent sign followed by a two character hex value. The same format of encoding is used in e-mail messages to transmit 8-bit characters (128 and above). TextCruncher's URL encoding converts text to a format that can be used in a URL, sent to a CGI script, or transmitted in e-mail successfully.
ASCII characters (Decimal 0 - 127)
These characters are the same on the Mac and the PC. TextCruncher's URL encoding encodes all punctuation in this character range and all non-printable characters. Spaces are converted to %20 rather than +, although + is used by many browsers. Encoding with %20 rather than + enables the converted text to be used by CGI scripts and e-mail as well as browsers. Browsers that recognize + for space will recognize %20 as well.
8-bit characters "high ASCII" (Decimal 128 - 255)
ISO 8859-1 Latin1 is the current web standard character set but it does not agree exactly with either Mac Standard Roman or ANSI, the western standards on Mac and PC. TextCruncher maps both Mac and PC characters to the corresponding ISO 8859-1 Latin1 character, if there is one, which is the same convention used for escaping 8-bit characters in HTML.
Both Mac and PC character sets contain characters that do not have any ISO counterpart. Those are the tinted characters in the 8-bit character crossmap chart below. The ANSI characters tinted green in the chart have no corresponding characters in Latin1, but are still encoded and decoded by many browsers using their ANSI values. These ANSI characters and their Mac counterparts are encoded by TextCruncher using the ANSI value. Mac characters with no ISO counterpart and no counterpart in the "green" ANSI range, like the Mac apple (240), are encoded as %3F (?). ANSI positions, like 128, that contain no character are also encoded as ?. Since the Mac-only characters (pink with no matching character in green) will not be preserved, it's a good idea to stay away from them in text that is destined to be URL-encoded.
Keep in mind that because the character values after 127 represent different characters on the Mac and on the PC, most character positions above 127 will URL-encode differently depending on whether TextCruncher is running on a Mac or PC. For example, numToChar(168) is the registered trademark symbol on the Mac, so it encodes to %AE (decimal 174), which is its character code in Latin1. On the PC, numToChar(174) is the registered trademark symbol, so that character position is what encodes to %AE.
TC_URLEncode(sourceString) - where sourceString is the string to operate on. Returns a copy of the source string encoded, or "" if there was an error. Hex-encodes the punctuation and 8-bit characters in a string. This method is deprecated in D11.
Example:
    set source = "This is an encoded string."
    put TC_URLEncode(source)
    -- "This%20is%20an%20encoded%20string%2e"
TC_URLDecode(sourceString) - where sourceString is the string to operate on. Returns a copy of the source string decoded, or "" if there was an error. Converts hex-encoding sequences in a string back to single characters. This method is deprecated in D11.
Example:
    set source = "This%20is%20a%20decoded%20string%2e"
    put TC_URLDecode(source)
    -- "This is a decoded string."
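A typical use of the encoder is preparing a value for the query portion of a URL. The handler below is a small sketch under stated assumptions: the handler name buildQuery and its parameters are hypothetical, and the "?name=value" layout is simply the common CGI convention, not something TextCruncher defines. Only TC_URLEncode and TC_GetLastError are TextCruncher calls.

```lingo
on buildQuery baseURL, paramName, paramValue
  -- Hypothetical helper: appends one encoded parameter to a URL.
  set encoded = TC_URLEncode(paramValue)
  -- An empty return can mean either an error or an empty input,
  -- so the error status is checked rather than the string itself.
  if TC_GetLastError() <> 0 then return ""
  return baseURL & "?" & paramName & "=" & encoded
end
```

Only the value is encoded here. Encoding the whole URL would also escape characters such as ":" and "/" that need to keep their special meaning.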
ERROR REPORTING
The previous version of TextCruncher could return error values or data from the same call. The current version returns only data from calls with return values. If there is an error that prevents the call from returning data, the closest thing to nothing, but in the same data type, is returned. For instance, a call that normally returns a Lingo list will return an empty list if there is an error. A call that normally returns a string will return an empty string if there is an error. This method of error handling ensures that the same data type will always be returned from a call, so it enables the Lingo programmer to write more generic error handlers. Instead of examining the return value to determine if an error occurred, use TC_GetLastError() directly after making any other call you want to monitor status for. The following chart lists the possible return values from TC_GetLastError().
TC_GetLastError( ) - Returns 0 for no error or a negative number if an error occurred. Returns the status of the last TextCruncher call made, for all calls other than TC_GetLastError itself or TC_ErrorCodeToString.
Example:
    set foundList = FindAll(source,"madras")
    set err = TC_GetLastError()
    if err = 0 then
      -- continue normal operation
    else if err = -10 then
      -- Inform user about error user can
      -- do something about
      alert "Not enough memory to continue"
    end if
TC_ErrorCodeToString(errorCodeNumber) - where errorCodeNumber is the negative integer returned from TC_GetLastError. Returns the text description of the specified error code.
Example:
    on showError which
      alert(TC_ErrorCodeToString(which))
    end
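Because every call reports its status the same way, one generic check can follow any TextCruncher call. The handler below is a minimal sketch of that idea; the name checkTC and the callName parameter are hypothetical, and the only TextCruncher calls used are TC_GetLastError and TC_ErrorCodeToString as described above.

```lingo
on checkTC callName
  -- Hypothetical helper: call directly after any TextCruncher call.
  -- Returns TRUE if the call succeeded, otherwise reports the error.
  set err = TC_GetLastError()
  if err <> 0 then
    put callName & ": " & TC_ErrorCodeToString(err)
    return FALSE
  end if
  return TRUE
end
```

A caller might use it as follows, with no other TextCruncher call between the operation and the check:

```lingo
set wordList = GetListOfWords(source)
if checkTC("GetListOfWords") then
  put count(wordList)
end if
```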