net.jxta.search.util
Class TermCountService
java.lang.Object
|
+--net.jxta.search.util.TermCountService
- public class TermCountService
- extends java.lang.Object
Used to compute relevance of search results for opensearch.
Method Summary |
void |
clearDebug()
|
static void |
compute_counts_bm(java.lang.String[] queryArray,
java.lang.String[] docArray,
int[][] TermCount,
int[] DocCount)
Computes raw stats for relevance routine using
BM method from net.jxta.search.java.util.BM. |
static void |
compute_counts_naive(java.lang.String[] queryArray,
java.lang.String[] docArray,
int[][] TermCount,
int[] DocCount)
Compute counts using naive method. |
static void |
main(java.lang.String[] argv)
|
static boolean |
quick_match(java.lang.String word,
java.lang.String q)
quick_match performs a match between "word" and query "q"
and returns true if there is a match and false otherwise.
Case is ignored. |
static java.lang.String |
quick_stem(java.lang.String orig_string)
quick_stem purports to implement fast, simple stemming using
common suffixes. |
void |
setDebug()
|
static boolean |
test(java.io.PrintStream ps)
By having each class have a test() method, we can
later wrap all of the testing in a class that hits
each test method in succession. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
TermCountService
public TermCountService()
setDebug
public void setDebug()
clearDebug
public void clearDebug()
compute_counts_bm
public static void compute_counts_bm(java.lang.String[] queryArray,
java.lang.String[] docArray,
int[][] TermCount,
int[] DocCount)
- Computes raw stats for relevance routine using
BM method from net.jxta.search.java.util.BM.
NOTE: No range or array checking is done in this
method. All the parameters are assumed to be initialized
correctly.
- Parameters:
queryArray - an array of strings, each of which is a term in the query string.
docArray - an array of documents; each entry is a string holding one document.
TermCount - int[][]
DocCount - int[]
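Because the method performs no range or array checking, a caller may want to validate its arguments first. The following is a minimal sketch of such a guard, not part of TermCountService. The class name CountArgsCheckSketch, the method requireInitialized, and the array layout (one TermCount row per query term, one column per document, one DocCount entry per query term) are assumptions; the original documentation does not state the expected dimensions.

```java
// Hypothetical caller-side guard; not part of net.jxta.search.util.
final class CountArgsCheckSketch {

    // ASSUMPTION: termCount is [queryArray.length][docArray.length] and
    // docCount is [queryArray.length]. The real layout is undocumented.
    static void requireInitialized(String[] queryArray, String[] docArray,
                                   int[][] termCount, int[] docCount) {
        if (queryArray == null || docArray == null
                || termCount == null || docCount == null) {
            throw new IllegalArgumentException("arrays must not be null");
        }
        if (termCount.length != queryArray.length) {
            throw new IllegalArgumentException("termCount needs one row per query term");
        }
        if (docCount.length != queryArray.length) {
            throw new IllegalArgumentException("docCount needs one entry per query term");
        }
        for (int[] row : termCount) {
            if (row == null || row.length != docArray.length) {
                throw new IllegalArgumentException("each termCount row needs one column per document");
            }
        }
    }
}
```

A caller would invoke CountArgsCheckSketch.requireInitialized(...) with the same four arguments immediately before calling compute_counts_bm.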
compute_counts_naive
public static void compute_counts_naive(java.lang.String[] queryArray,
java.lang.String[] docArray,
int[][] TermCount,
int[] DocCount)
- Compute counts using naive method.
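The documentation does not describe what the naive method counts, so the following is only an illustrative sketch of one plausible reading. It assumes that TermCount[q][d] holds the number of whitespace-separated tokens in document d that equal query term q (ignoring case), and that DocCount[q] holds the number of documents containing term q at least once. The class NaiveCountSketch and both of these meanings are assumptions, not the actual implementation.

```java
// Hypothetical illustration; the real compute_counts_naive may differ.
final class NaiveCountSketch {

    // ASSUMPTION: termCount[q][d] = occurrences of queryArray[q] in docArray[d];
    // ASSUMPTION: docCount[q] = number of documents containing queryArray[q].
    // Like the documented methods, this performs no argument checking.
    static void computeCountsNaive(String[] queryArray, String[] docArray,
                                   int[][] termCount, int[] docCount) {
        for (int q = 0; q < queryArray.length; q++) {
            docCount[q] = 0;
            for (int d = 0; d < docArray.length; d++) {
                int hits = 0;
                // Split the document on runs of whitespace and compare each token.
                for (String token : docArray[d].trim().split("\\s+")) {
                    if (token.equalsIgnoreCase(queryArray[q])) {
                        hits++;
                    }
                }
                termCount[q][d] = hits;
                if (hits > 0) {
                    docCount[q]++;
                }
            }
        }
    }
}
```

Under these assumptions, the caller allocates termCount as new int[queryArray.length][docArray.length] and docCount as new int[queryArray.length] before the call.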
quick_stem
public static java.lang.String quick_stem(java.lang.String orig_string)
- quick_stem purports to implement fast, simple stemming using
common suffixes. The method assumes (?) that the parameter
has already been converted to upper case.
- Parameters:
orig_string - a term which is assumed to have been converted to upper case.
- Returns:
- String stem_string is supposed to be the parameter with
one of the common suffixes stemmed away.
FIXME: Somewhere, there should be a reference in the literature
for how this method works:
reference.
FIXME: This method appears to have a bug insofar as stemming
*ING returns *IN, *ER returns *E, etc.
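For comparison with the behavior described in the FIXME, here is a minimal sketch of suffix stripping that removes the whole suffix, so that *ING yields * rather than *IN. It is not the quick_stem implementation: the class SuffixStemSketch, the suffix list, and the minimum stem length are all hypothetical, since the original does not document which suffixes are handled.

```java
// Hypothetical illustration of simple suffix stripping on upper-case input.
final class SuffixStemSketch {

    // ASSUMPTION: this suffix list is invented for the example.
    private static final String[] SUFFIXES = {"ING", "ED", "ER", "ES", "S"};

    // ASSUMPTION: stems shorter than this are left alone (e.g. "RING").
    private static final int MIN_STEM_LENGTH = 3;

    // Expects a term already converted to upper case.
    static String stem(String upperTerm) {
        for (String suffix : SUFFIXES) {
            int stemLength = upperTerm.length() - suffix.length();
            if (upperTerm.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                // Remove the entire suffix, not just part of it.
                return upperTerm.substring(0, stemLength);
            }
        }
        return upperTerm; // no suffix applied
    }
}
```

With this list, SuffixStemSketch.stem("SEARCHING") returns "SEARCH" and SuffixStemSketch.stem("INDEXER") returns "INDEX", while short words such as "RING" are returned unchanged.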
quick_match
public static boolean quick_match(java.lang.String word,
java.lang.String q)
- quick_match performs a match between "word" and query "q"
and returns true if there is a match and false otherwise.
Case is ignored.
- Parameters:
word - the word to test. Whitespace matters in this
matching function, so word " foo" will not match
query "foo".
q - a query term. The same conditions on whitespace
hold as for word.
- Returns:
- boolean true for case-insensitive match, false otherwise.
FIXME: There is a second part to the function besides the
naive match. This needs to be documented.
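The documented part of the contract (case-insensitive, whitespace-sensitive comparison) can be sketched as below. This covers only the naive match; the undocumented second part mentioned in the FIXME is not modeled. The class QuickMatchSketch and the method naiveMatch are hypothetical names, not part of the library.

```java
// Hypothetical illustration of the documented naive match only.
final class QuickMatchSketch {

    // Case is ignored; whitespace is significant, so " foo" does not match "foo".
    static boolean naiveMatch(String word, String q) {
        return word.equalsIgnoreCase(q);
    }
}
```

For example, QuickMatchSketch.naiveMatch("Foo", "foo") is true, while QuickMatchSketch.naiveMatch(" foo", "foo") is false because of the leading space.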
test
public static boolean test(java.io.PrintStream ps)
- By having each class have a test() method, we can
later wrap all of the testing in a class that hits
each test method in succession.
TODO: Everything goes to System.out at the moment.
It would be smart to be able to specify some arbitrary
PrintStream for future automatic testing.
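One way the wrapping class described above might look is sketched here. It runs a list of test(PrintStream)-style methods in succession and reports each result to the supplied stream. The class TestRunnerSketch, the method runAll, and the PASS/FAIL output format are assumptions made for illustration; no such runner is described in the original.

```java
import java.io.PrintStream;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical runner that calls each test(PrintStream) method in turn.
final class TestRunnerSketch {

    // Each entry stands for one class's static boolean test(PrintStream) method.
    // Returns true only if every test returned true.
    static boolean runAll(List<Predicate<PrintStream>> tests, PrintStream ps) {
        boolean allPassed = true;
        int index = 0;
        for (Predicate<PrintStream> test : tests) {
            boolean ok = test.test(ps);
            ps.println("test " + index + ": " + (ok ? "PASS" : "FAIL"));
            allPassed &= ok;
            index++;
        }
        return allPassed;
    }
}
```

A method reference such as TermCountService::test has the right shape to be added to the list, and passing a PrintStream other than System.out lets an automated harness capture the output.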
main
public static void main(java.lang.String[] argv)