Web Development

 
Answer: Re: How send email with attachment????
Member 154211923-Feb-05 1:55
Question: what means proxy-connection:?
ThinkingPrometheus 18-Feb-05 9:06
General: Hiding HTML controls using Javascript
hitu_kapadia 18-Feb-05 3:08
General: Re: Hiding HTML controls using Javascript
Qaiser.Muhammad 18-Feb-05 14:24
General: Re: Hiding HTML controls using Javascript
hitu_kapadia 21-Feb-05 2:54
General: plug-in in IE
Pauwl 18-Feb-05 2:48
Question: how to set tomcat4.1 to run the java class with larger heap size
kinkei 17-Feb-05 20:41
Question: how to invoke a java class with 6 parameters within jsp which need 800M heap size
kinkei 17-Feb-05 20:07
I have just begun learning JSP and Java, so I have the question below:

Situation:
My boss asked me to do a project for online data mining.
He requested the following:

1. Write a JSP web page with some text fields or pull-down menus to collect the user's options and the parameters needed by the Java class.
2. When the user clicks the button on the web page, the Java class has to be invoked.

Questions:
1. If I put the .java file on the server, I can simply open cmd and type the following command:
java -Xmx800M P05context 1 1 3 6 F01xyz.txt F10xyz.txt

where F01xyz.txt is the prepared text file that the Java class reads, and F10xyz.txt is the output file generated by P05context.class.

But I don't know how to invoke P05context.class from JSP.

My boss has suggested two ways to do this:

a. Write a Java virtual class that executes the Java program.
But I don't know how to write it.
I think conceptually it will be like this:
public class run_java
  {
    public void runJava(parameter1, parameter2, ..., parameter6)
    {
      // run the cmd command
      // java -Xmx800M P05context 1 1 3 6 F01xyz.txt F10xyz.txt
    }
  }
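
Option (a) could be sketched with `java.lang.ProcessBuilder`, which starts a second JVM with its own `-Xmx800M` limit, so Tomcat's own heap setting is not involved. This is only a sketch under assumptions: the names `MinerLauncher`, `buildCommand`, `run`, and the log file `P05context.log` are made up for illustration, `java` is assumed to be on the server's PATH, and `workDir` is assumed to contain `P05context.class` and the input file. `redirectOutput` needs Java 7 or later.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: launches P05context in its own JVM with a larger heap. */
class MinerLauncher {

    /** Builds the same command line that already works from cmd. */
    static List<String> buildCommand(String heap, String... minerArgs) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");              // assumes "java" is on the PATH
        cmd.add("-Xmx" + heap);       // e.g. "800M"
        cmd.add("P05context");
        cmd.addAll(Arrays.asList(minerArgs));
        return cmd;
    }

    /** Runs the command in workDir and returns the exit code of the child JVM. */
    static int run(File workDir, List<String> cmd)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.directory(workDir);        // assumed to hold P05context.class and the input file
        pb.redirectErrorStream(true); // merge stderr into stdout
        // Send the child's console output to a log file (hypothetical name)
        // so the child never blocks on a full output pipe.
        pb.redirectOutput(new File(workDir, "P05context.log"));
        Process p = pb.start();
        return p.waitFor();           // blocks until the mining run has finished
    }
}
```

From a JSP or servlet this might be called as `MinerLauncher.run(new File(dir), MinerLauncher.buildCommand("800M", "1", "1", "3", "6", "F01xyz.txt", "F10xyz.txt"))`, where `dir` is whatever directory holds the class and data files. Because `P05context.main` catches exceptions and only prints a message, the exit code alone may not reveal a failure, so the log file is worth checking.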


b. Write a JavaBean and use it in the JSP. I have tried the code at the end of this message,
but the Tomcat server replies with an internal error.

What can I do?

The JSP file:
------------------------------------------------------------------------------

<jsp:useBean id="test" scope="session" class ="P05context" />
<html>
<head></head>
<body>
<% test.P05context(1,1,3,6,F01xyz.txt,F10xyz.txt); %>
</body>
</html>

------------------------------------------------------------------------------
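
For option (b), two things in the scriptlet above look like probable causes of the internal error: `test.P05context(...)` calls the constructor as if it were a method, and the file names are not quoted strings. In addition, `<jsp:useBean>` needs a public no-argument constructor, while the constructor shown in the class below takes a `String[]`. A hedged sketch of a small wrapper follows. `MinerBean`, `buildArgs`, and `mine` are hypothetical names, and the class is looked up by name only so that the sketch compiles on its own; in a real webapp the bean would be declared public, placed in a package, and could simply call `new P05context(args)`.

```java
import java.lang.reflect.InvocationTargetException;

/** Hypothetical bean with an implicit no-argument constructor. */
class MinerBean {

    /** Converts the six form values into the String[] that the constructor expects. */
    static String[] buildArgs(int method, int termSize, int windowSize,
                              int vectorSize, String inFile, String outFile) {
        return new String[] {
            String.valueOf(method), String.valueOf(termSize),
            String.valueOf(windowSize), String.valueOf(vectorSize),
            inFile, outFile
        };
    }

    /** Runs the miner inside the current JVM by calling P05context(String[]). */
    void mine(String[] args) throws Exception {
        // Reflection is used only to keep this sketch free of a compile-time
        // dependency on P05context.
        Class<?> miner = Class.forName("P05context");
        try {
            // The cast passes the array as ONE argument instead of spreading it.
            miner.getConstructor(String[].class).newInstance((Object) args);
        } catch (InvocationTargetException e) {
            // Unwrap whatever the constructor itself threw.
            throw new Exception("P05context failed: " + e.getCause(), e.getCause());
        }
    }
}
```

Run this way, the mining happens inside Tomcat's own JVM, so the 800M heap would have to be granted to Tomcat itself rather than passed on a command line.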

The Java class:
------------------------------------------------------------------------------
/* P05context.java - Performs text mining.
 * George Smith 2004-05-02
 * 2004-05-16 Complete re-write using one separate TermPairMap instead of a
 *             TermPairMap per term. Structure changed completely.
 * 2004-05-19 WriteVector re-written for better speed.
 */

import java.text.DecimalFormat;
import java.util.Date;
import java.io.IOException;
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.StringTokenizer;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Collections;
import java.util.Comparator;
import java.io.PrintWriter;
import java.io.BufferedWriter;
import java.io.FileWriter;

/**
 * An application for text mining using the EMI or Proximity methods.
 *
 * @author George Smith
 */
public class P05context {
// *****************************************************************************
//                Class fields.
// *****************************************************************************
   /** Switch for showing progress reports. */
   private final static boolean REPORT_PROGRESS = false;
   /** Conversion factor for log to base 2. */
   private final static float log2Factor = (float)( 1 / Math.log(2));
   /** The OS dependent end of line string. */
   private final static String ls = System.getProperty("line.separator");
   /** The help message. */
   private final static String helpMessageP05context =
      "P05context Usage:" + ls +
      "    java P05context m x y n F01.txt F10.txt" + ls +
      "where " + ls +
      "    m is the mining method: 1 = EMI, 2 = Proximity." + ls +
      "    x is the number of consecutive words to construct a term." + ls +
      "    y is the window size in words." + ls +
      "    n is the number of defining terms in a context vector." + ls +
      "    F01.txt is the input documents file." + ls +
      "    F10.txt is the output context vectors file.";

// *****************************************************************************
//                Instance Variables.
// *****************************************************************************
   /* The data storage for this consists of
    * A primary HashMap called termOneMap which has keys of (String) term,
    * and values of (TermValue)( (int) count + (HashMap) columnMap ).
    * The matrix is contained in the pairData objects.
    * Each columnMap is a HashMap of term / (MutableInt) count.
    * The matrix is a full matrix not a half diagonal matrix. This takes more
    * memory, but not twice as much, and gives much faster access after loading.
    * Initial creation takes longer but updating time is not increased because
    * matrix data is in MutableIntegers which are referenced from two columnMaps
    * (for term1 and term2) but require only a single change per update.
    */

   /** Used for measuring the elapsed run time. */
   private ElapsedRunTime elapsedRunTime = new ElapsedRunTime();

   /**
    * This map contains one entry per term. Each entry consists of:-
    * The key, a String containing the term.
    * The value, a TermMapValue which has term data and a reference to the key.
    * A Collection of TermMapValue can be obtained for sorting, etc.
    */
   private TermMap termMap;

   /**
    * This specialized map contains one entry per term pair. Each holds:-
    * String key1 and String key2. Both together form the key.
    * int count and float score. Both make up the value.
    */
   private TermPairMap termPairMap;

   /** Used for checking used and free memory. */
   private MemoryInfo memory;

   /** Accumulates the console report data and then writes it to console. */
   private ConsoleReport consoleReport;

   private final boolean emiMethod;
   private final boolean proxMethod;
   private final int termSize;
   private final int windowSize;
   private final int vectorSize;
   private final String docsF01file;
   private final String contextF10file;

   private int windowsScanned;

// *****************************************************************************
// method         public static void main(String[] args)
// *****************************************************************************
   /**
    * Calls the constructor using the command line arguments and
    * if any exceptions are thrown catches them and prints a message.
    * @param args The command line arguments.
    */
   public static void main(String[] args) {

      try {
         new P05context(args);
      } catch (Exception e) {
         System.out.println(ls + "*** Error ***");
         String str = e.toString();
         int colonAt = str.indexOf(":");
         if (colonAt < 0) {
            e.printStackTrace();
         } else {
            int startAt = str.substring(0,colonAt).lastIndexOf(".") + 1;
            System.out.println(str.substring(startAt));
            System.out.println(ls + helpMessageP05context);
//            e.printStackTrace(); // Use during debugging.
         }
      }
   } // ******  end of  main  **************************************************

   // **************************************************************************
   // constructor    P05context(String[] args)
   // **************************************************************************
   /**
    * The constructor requires command line arguments.
    * It automatically runs the whole application.
    * @param args The command line parameters.
    */
   public P05context(String[] args) throws IOException {

      // Read all command line variables and do sanity checks.
      if (args.length != 6) {
         throw new IllegalArgumentException(
                                 "Wrong number of command line parameters.");
      }
      int methodID = Integer.parseInt(args[0]);
      if (methodID == 1) {
         emiMethod = true;
         proxMethod = false;
      } else if (methodID == 2) {
         emiMethod = false;
         proxMethod = true;
      } else {
         emiMethod = false;
         proxMethod = false;
         throw new IllegalArgumentException("Invalid method type.");
      }
      termSize = Integer.parseInt(args[1]);
      windowSize = Integer.parseInt(args[2]);
      vectorSize = Integer.parseInt(args[3]);
      if (termSize < 1 || windowSize < 1 || vectorSize < 1) {
         throw new IllegalArgumentException(
                                          "x, y, and n must all be positive.");
      }
      if (windowSize < termSize) {
         throw new IllegalArgumentException(
                           "Window size must be at least as big as term size.");
      }
      docsF01file = args[4];
      contextF10file = args[5];

      // Create the two key data maps and other objects.
      termMap = new TermMap();
      termPairMap = new TermPairMap();
      memory = new MemoryInfo();
      consoleReport = new ConsoleReport();

      windowsScanned = 0;

      // Run the whole process.
      runP05context();
   }

   /**
    * Runs the application.
    * @throws IOException
    */
   private void runP05context() throws IOException {

      // Run the Context program.
      if (REPORT_PROGRESS) {reportProgress("Start prescanAllDocuments");}
      prescanAllDocuments(docsF01file);
      if (REPORT_PROGRESS) {reportProgress("Start processAllDocuments");}
      processAllDocuments(docsF01file);
      if (emiMethod) {
         reportProgress("Start writeVectorsEMI    ");
         writeVectorsEMI();
      } else if (proxMethod) {
         reportProgress("Start writeVectorsProx   ");
         writeVectorsProx();
      }

      if (REPORT_PROGRESS) {reportProgress("Finished                 ");}
      consoleReport.printStatisticsSummary();
   }


   /**
    * Used with proximity method to copy TermPair scores to TermValues.
    * Uses an anonymous TermPairMap.ForEachEntry class with a run method
    * to cause transferOneScoreToTermValues to run for each TermPair entry.
    */
   private void transferAllScoresToTermValues() {
      termPairMap.run(new TermPairMap.ForEachEntry() {
         public boolean run(TermPairMap.Entry entry) {
            return transferOneScoreToTermValues(entry);
         }
      });
   }

   /**
    * Used with proximity method to copy TermPair scores to TermValues.
    * The TermPair score is transferred only if it is larger than the
    * existing TermValue score.
    */
   private boolean transferOneScoreToTermValues(TermPairMap.Entry entry) {
      TermValue tv1 = (TermValue) (entry.key1);
      TermValue tv2 = (TermValue) (entry.key2);

      float score = entry.score;
      if (tv1.score < score) {
          tv1.score = score;
      }
      if (tv2.score < score) {
          tv2.score = score;
      }
      return true;
   }

   /**
    * Calculates logs to base 2.
    * @param n The number to find the log of.
    * @return The log to base 2.
    */
   private float log2(float n) {
      return  (float) Math.log(n) * log2Factor;
   }


   /**
    * Used by emi method to compute all emi values.
    * Uses an anonymous TermPairMap.ForEachEntry to run computeEMI on
    * every TermPairMap entry in turn.
    */
   private void computeAllEMIs() {
      termPairMap.run(new TermPairMap.ForEachEntry() {
         public boolean run(TermPairMap.Entry entry) {
            return computeEMI(entry);
         }
      });
   }

   /**
    * Computes the emi score for one TermPair entry. It puts the result in
    * TermPairMap.score and if the value is larger than the score already in
    * either of the two associated TermValue.score it updates these also.
    *
    * @param entry The TermPair.Entry to calculate for.
    * @return Always returns true.
    */
   private boolean computeEMI(TermPairMap.Entry entry) {
      TermValue tv1 = (TermValue) (entry.key1);
      TermValue tv2 = (TermValue) (entry.key2);

      // Not permitted to have the same term for tv1 and tv2.
      if (tv1.term == tv2.term) {
         entry.score = 0;
         return true;
      }

      float t1 = tv1.count / (float) this.windowsScanned;
      float t2 = tv2.count / (float) this.windowsScanned;
      float t1t2 = entry.count / (float) this.windowsScanned;

      float emi = t1t2 * log2((t1t2 / (t1 * t2)) + 1)
                  + (1 - t1 - t2 + t1t2)
                        * log2((1 - t1 - t2 + t1t2) / ((1 - t1)*(1 - t2)) + 1);

      // Set termMap.score if necessary, for both terms.
      if (tv1.score < emi) {
          tv1.score = emi;
      }
      if (tv2.score < emi) {
          tv2.score = emi;
      }

      // Set score to emi.
      entry.score = emi;
      return true;
   }

   /**
    * Computes a "half emi" for when no joint (term1, term2) entry exists.
    * @param t1 The probability of term 1.
    * @param t2 The probability of term 2.
    * @return The emi score.
    */
   private float computeHalfEMI(float t1, float t2) {
      return (1 - t1 - t2) * log2((1 - t1 - t2) / ((1 - t1)*(1 - t2)) + 1);
   }

   /**
    * Writes the proximity score to F10.txt for proximity method.
    * The sequence of actions (using similar numbering to writeVectorsEMI) is:-
    *
    * 1. Sorting records into descending maximum score order by:-
    * (A) Get a TermValueList to allow easy access by index (inTVList).
    * (F) Go through all termPairMap records, calculate score. If the score is
    *     larger for a term than at (E) update that term's score.
    * (G) Sort inTVList on descending score. This sets the output record order.
    * (H) Go through termMap and set count to -1.
    * (I) Index through the inTVList to get the TermValues to output.
    *
    * 2. Writing output records requires many scans through the termPairMap.
    *    This is very time consuming. To reduce scan time break the termPairMap
    *    into smaller chunkMap parts (one chunkMap at a time). Each chunkMap
    *    holds the TermPair data for the "rows" and "columns" corresponding
    *    to a group of terms (eg 1,000 terms for a 1,000,000 term data set).
    *    To do this the TermMap count field is used as a selection marker:-
    *  (A) Scan through all terms in termMap, marking some with an index.
    *  (B) Using termPairMap.ForEach select indexes of termPairMap entries which
    *      are marked in termMap into an IndexListList.
    *  (C) Index through the chunk processing one term at a time (see 3).
    *  (D) Remove the indices from termMap.
    *
    * 3. Process one term at a time into the output file:-
    * (A) Create an ArrayList for TermValues for output (outTVList).
    * (B) Sequentially use the indexes in inTVList to get termPairMap indexes
    *     from indexListList and for each TermPair put the associated TermValues
    *     into outTVList.
    * (D) Sort the outTVList on descending score. This orders the output fields.
    * (E) Write the top n terms in outTVList to output.
    * (F) Clear all indexes from the scores in outTVList.
    *
    * @throws IOException
    */
   public void writeVectorsProx() throws IOException {
      int recordsWritten = 0;
      DecimalFormat df5 = new DecimalFormat("0.00000");
      PrintWriter out = new PrintWriter(new BufferedWriter(
                                          new FileWriter(contextF10file)));
      consoleReport.fileF10 = contextF10file;

      // 1(A) Get a list of terms which can be accessed by indexing.
      // Contains an ArrayList of references to all TermValue in termMap.
      TermValueList inTVList = termMap.getTermValueList();

      // 1(F) Transfer scores to TermValues in termMap.
      transferAllScoresToTermValues();

      // 1(G) Sort the inTVList
      inTVList.sort(new TermValueDescendingScore());

      // Prepare inTVList for use of count as extraction marker indexes.
      for (int i = 0; i < inTVList.size(); i++ ) {
         TermValue tv = inTVList.getTermValue(i);
         tv.count = -1;
      }

      // A TermValue used to transfer the term String to anonymous classes.
      final TermValue test = new TermValue("", 0);

      // 1(H) Bite off chunks of inTVList to process. Terms in chunk = size/10 + 1000
//      int chunkSize = (int) (2 * Math.sqrt(inTVList.size()) + 16);
      int chunkSize = (int) (inTVList.size() / 10 + 1000);

      // Setup an IndexListList for holding chunk TermPairMap indexes.
      final IndexListList indexListList = new IndexListList(chunkSize);

      // Now work through data (chunk at a time).
      for (int start = 0; start < inTVList.size(); start += chunkSize) {
         int end = start + chunkSize < inTVList.size() ?
                                          start + chunkSize : inTVList.size();


         // Mark the termMap's TermValues for selection using count.
         for (int index = 0; start + index < end; index++ ) {
            inTVList.getTermValue(start + index).count = index;
         }

         // Extract the termPairMap indexes.
         termPairMap.run(new TermPairMap.ForEachWithIndex() {
            public boolean run(Object key1, Object key2,
                                                int count, float score, int i) {
               return indexListList.extractIndexes(key1, key2, count, score, i);
            }
         });

         // Reset the termMap's TermValue's count used for selection.
         for (int index = 0; start + index < end; index++ ) {
            inTVList.getTermValue(start + index).count = -1;
         }

         // Index through this chunk, (one term at a time).
         for (int listIndex = 0; start + listIndex < end; listIndex++ ) {

            // 3(A) outTVList will contain the list of TermValues to be output.
            ArrayList outTVList = new ArrayList();

            // Get the TermValue we are interested in.
            TermValue tv1 = inTVList.getTermValue(start + listIndex);

            // 3(B) Work through the termPairMap entries for this tv1.
            for (int j = 0; j < indexListList.index(listIndex).size(); j++ ) {

               int tpmIndex = indexListList.index(listIndex).get(j);

               // Put the TermValues referenced in the termPairMap into outTVList.
               // The other TermValue could be either of key1 or key2.
               TermValue tpmTV1 = (TermValue) termPairMap.getKey1(tpmIndex);
               TermValue tpmTV2 = (TermValue) termPairMap.getKey2(tpmIndex);
               int iScore = (int)(termPairMap.getScore(tpmIndex) * 100000 + .5);
               if (tpmTV1 == tv1) {
                  tpmTV2.count = iScore;
                  outTVList.add(tpmTV2);
               } else if (tpmTV2 == tv1) {
                  tpmTV1.count = iScore;
                  outTVList.add(tpmTV1);
               } else {
                  throw new AssertionError("Bad tpm selection");
               }
            }

            // 3(D) Sort the outTVList on descending score.
            Collections.sort(outTVList, new TermValueDescendingCount());

            // 3(E) Write the top n terms to output.
            int z = outTVList.size() < vectorSize ?
                                                 outTVList.size() : vectorSize;
            out.print(tv1.term);
            for (int k = 0; k < z; k++ ) {
               TermValue tv2 = ((TermValue) outTVList.get(k));
               float fScore = (float) tv2.count / 100000;
               out.print(" (" + tv2.term + " " + df5.format(fScore) + ")");
            }
            out.println();
            recordsWritten++;

            // 3(F) Clear all indexes in the outTVList count fields.
            for (int j = 0; j < outTVList.size(); j++ ) {
               ((TermValue) outTVList.get(j)).count = -1;
            }
         }

         // Clear indexListList preparatory to the next chunk load.
         indexListList.clear();
      }
      out.close();
      consoleReport.recordsWrittenF10 = recordsWritten;
   }

   /**
    * Writes the EMI score to F10.txt for emi method.
    * The sequence of actions is:-
    *
    * 1. Sorting records into descending maximum score order by:-
    * (A) Get a TermValueList to allow easy access by index (inTVList).
    * (B) Sort this list into ascending count order (opposite of score).
    * (C) Select the first n terms of lowest count and hold them (topTVList).
    * (D) Pick the one term with lowest count.
    * (E) Go through all other records using the count of (D) to calculate a
    *     half emi for each (save as score). This emi will be each term's
    *     score if there is no higher emi from a TermPair calculation.
    *     Leave the score of the record with lowest count as 0.
    * (F) Go through all termPairMap entries, calculate score. If the score is
    *     larger for a term than at (E) update that term's score.
    * (G) Sort inTVList on descending score. This sets the output record order.
    * (H) Go through termMap, put count probability in score and set count to -1.
    * (I) Index through the inTVList to get the TermValues to output.
    *
    * 2. Writing output records requires many scans through the termPairMap.
    *    This is very time consuming. To reduce scan time break the termPairMap
    *    into smaller chunkMap parts (one chunkMap at a time). Each chunkMap
    *    holds the termPair data for the "rows" and "columns" corresponding
    *    to a group of terms (eg 1,000 terms for a 1,000,000 term data set).
    *    To do this the termMap score field is used as a selection marker:-
    *
    *  (A) Scan through all terms in termMap, marking some with an index.
    *  (B) Using termPairMap.ForEach select indexes of termPairMap entries which
    *      are marked in termMap into an IndexListList.
    *  (C) Index through the chunk processing one term at a time (see 3).
    *  (D) Remove the indices from termMap.
    *
    * 3. Process one term into the output file:-
    * (A) Create an ArrayList for TermValues for output (outTVList).
    * (B) Sequentially use the indexes in inTVList to get termPairMap indexes
    *     from indexListList and for each TermPair put the associated TermValues
    *     into outTVList.
    * (C) Scan the topTVList. Compute each halfEMI. If the TermValue score has
    *     a value which is lower, overwrite it. If there is no value put the
    *     topTVList entry into outTVList with the calculated score.
    * (D) Sort the outTVList on descending score. This orders the output fields.
    * (E) Write the top n terms in outTVList to output.
    * (F) Clear all indexes from the scores in outTVList.
    */
   public void writeVectorsEMI() throws IOException{<br />
<br />
      int recordsWritten = 0;<br />
      DecimalFormat df5 = new DecimalFormat("0.00000");<br />
      PrintWriter out = new PrintWriter(new BufferedWriter(<br />
                                          new FileWriter(contextF10file)));<br />
      consoleReport.fileF10 = contextF10file;<br />
      <br />
      // 1(A)(B) Get a list of terms which can be accessed by indexing.<br />
      // Contains an ArrayList of references to all TermValue in termMap.<br />
      TermValueList inTVList = termMap.getTermValueList();<br />
<br />
      // An array of references to the n lowest count TermValue in inTVList.<br />
      TermValue[] topTVList = new TermValue[0];<br />
      // The inverse of the count of total windows scanned.<br />
      float rWinCnt = 1 /  (float) windowsScanned;;<br />
      // Probability of the single lowest count TermValue inTVList.<br />
      float t1;<br />
      <br />
<br />
      // 1(B) Sort inTVList into ascending count order.<br />
      inTVList.sort(new TermValueAscendingCount());<br />
<br />
      // 1(C) Save the top n terms (lowest emi) for later use.<br />
      int topTVListSize = vectorSize + 1 < inTVList.size() ? <br />
                                                vectorSize + 1 : inTVList.size();<br />
      // An array of references to the n lowest count TermValue in inTVList.<br />
      topTVList = new TermValue[topTVListSize];<br />
      for (int i = 0; i < topTVListSize; i++ ) {<br />
         topTVList[i] = inTVList.getTermValue(i);<br />
      }<br />
<br />
      // 1(D) Probability of the single lowest count TermValue in inTVList.<br />
      t1 = inTVList.getTermValue(0).count * rWinCnt;<br />
<br />
      // 1(E) Go through all the inTVList calculating their halfEmiScore.<br />
      for (int i = 1; i < inTVList.size(); i++ ) {<br />
         TermValue tv = inTVList.getTermValue(i);<br />
         tv.score = computeHalfEMI(t1, tv.count * rWinCnt);<br />
      }<br />
<br />
      // 1(F) Now go through all emi termPairMap.<br />
      computeAllEMIs();<br />
<br />
      // 1(G) Sort the inTVList<br />
      inTVList.sort(new TermValueDescendingScore());<br />
<br />
      // Prepare inTVList for use of count as extraction marker indexes.<br />
      for (int i = 0; i < inTVList.size(); i++ ) {<br />
         TermValue tv = inTVList.getTermValue(i); <br />
         tv.score = tv.count * rWinCnt;<br />
         tv.count = -1;<br />
      }<br />
<br />
      // A TermValue used to transfer the term String to anonymous classes.<br />
      final TermValue test = new TermValue("", 0);<br />
      <br />
      // 1(H)Bite off chunks of inTVList to process. Terms in chunk = sqrt(size/2)<br />
//      int chunkSize = (int) (2 * Math.sqrt(inTVList.size()) + 16);<br />
      int chunkSize = (int) (inTVList.size() / 10 + 1000);<br />
      <br />
<br />
      // Setup an IndexListList for holding chunk TermPairMap indexes.<br />
      final IndexListList indexListList = new IndexListList(chunkSize);<br />
         <br />
      // Now work through data (chunk at a time).<br />
      for (int start = 0; start < inTVList.size(); start += chunkSize) {<br />
         int end = start + chunkSize < inTVList.size() ?<br />
                                          start + chunkSize : inTVList.size();<br />
<br />
         // 2(A) Mark the termMap's TermValues for selection using count.<br />
         for (int index = 0; start + index < end; index++ ) {<br />
            inTVList.getTermValue(start + index).count = index;<br />
         }<br />
<br />
         // 2(B) Extract the termPairMap indexes.<br />
         termPairMap.run(new TermPairMap.ForEachWithIndex() {<br />
            public boolean run(Object key1, Object key2, <br />
                                                int count, float score, int i) {<br />
               return indexListList.extractIndexes(key1, key2, count, score, i);<br />
            }<br />
         });<br />
<br />
         // Reset the termMap's TermValue's count used for selection.<br />
         for (int index = 0; start + index < end; index++ ) {<br />
            inTVList.getTermValue(start + index).count = -1;<br />
         }<br />
<br />
<br />
          // 2(C) Index through this chunk, (one term at a time).                  <br />
         for (int listIndex = 0; start + listIndex < end; listIndex++ ) {<br />
<br />
             // 3(A) outTVList will contain the list of TermValues to be output.<br />
             ArrayList outTVList = new ArrayList();<br />
      <br />
            // Get the TermValue we are interested in.<br />
            TermValue tv1 = inTVList.getTermValue(start + listIndex);<br />
<br />
            // 3(B) Work through the termPairMap entries for this tv1.<br />
            for (int j = 0; j < indexListList.index(listIndex).size(); j++ ) {<br />
<br />
               int tpmIndex = indexListList.index(listIndex).get(j);<br />
<br />
               // Put the TermValues refereced in the termPairMap into outTVList<br />
               // The other TermValue could be either of key1 or key2.<br />
               TermValue tpmTV1 = (TermValue) termPairMap.getKey1(tpmIndex);<br />
               TermValue tpmTV2 = (TermValue) termPairMap.getKey2(tpmIndex);<br />
               int iScore = (int)(termPairMap.getScore(tpmIndex) * 100000 + .5);<br />
               if (tpmTV1 == tv1) {<br />
                  tpmTV2.count = iScore; <br />
                  outTVList.add(tpmTV2);<br />
               } else if (tpmTV2 == tv1) {<br />
                  tpmTV1.count = iScore; <br />
                  outTVList.add(tpmTV1);<br />
               } else {<br />
                  throw new AssertionError("Bad tpm selection");<br />
               }<br />
            }<br />
<br />
            // 3(C) Compute halfEMIs for topTVList and update outTVList.<br />
            for (int j = 0; j < topTVList.length; j++ ) {<br />
               if (topTVList[j] == tv1) {<br />
                  continue;<br />
               }<br />
               float halfEMI = computeHalfEMI(tv1.score, topTVList[j].score);<br />
               int iScore = (int) (halfEMI * 100000 + .5);<br />
               if (topTVList[j].count < 0) {<br />
                  // This TermValue is not in outTVList. Add it.<br />
                  topTVList[j].count = iScore;<br />
                  outTVList.add(topTVList[j]);            <br />
               } else {<br />
                  if (topTVList[j].count < iScore) {<br />
                     topTVList[j].count = iScore;<br />
                  } // else no action required.<br />
               }<br />
            }<br />
<br />
            // 3(D) Sort outTVList on descending count (which holds the scaled score).<br />
            Collections.sort(outTVList, new TermValueDescendingCount());         <br />
            <br />
            // 3(E) Write the top n terms to output.   <br />
                <br />
            int z = outTVList.size() < vectorSize ? <br />
                                                outTVList.size() : vectorSize;<br />
            out.print(tv1.term);<br />
            for (int k = 0; k < z; k++ ) {<br />
               TermValue tv2 = ((TermValue) outTVList.get(k));<br />
               float fScore = (float) tv2.count / 100000;            <br />
               out.print(" (" + tv2.term + " " + df5.format(fScore) + ")");<br />
            }<br />
            out.println();<br />
            recordsWritten++;<br />
       <br />
            // 3(F) Clear all counts in outTVList.<br />
            for (int j = 0; j < outTVList.size(); j++ ) {<br />
               ((TermValue)outTVList.get(j)).count = -1;<br />
            }<br />
         }<br />
         indexListList.clear();<br />
      }<br />
      out.close();<br />
      consoleReport.recordsWrittenF10 = recordsWritten;      <br />
   }<br />
<br />
   // **************************************************************************<br />
   //  class          TermValueList<br />
   // **************************************************************************<br />
   /**<br />
    * The ArrayList holds references to TermValue data from TermMap.<br />
    * The TermValueList is unsorted if the simple constructor is called, or<br />
    * will be sorted if the constructor includes an appropriate Comparator.  <br />
    * <br />
    * @author George Smith<br />
    */<br />
   private class TermValueList {<br />
         <br />
      /** The TermValue references held by this list. */<br />
      private ArrayList list;<br />
<br />
      /**<br />
       * Constructor with source map only.<br />
       * Produces an unsorted TermValueList.<br />
       * @param map The source HashMap.<br />
       */<br />
      public TermValueList(HashMap map) {<br />
         list = new ArrayList(map.values());<br />
      } <br />
<br />
      /**<br />
       * Constructor with source map and comparator for sorting.<br />
       * Produces a sorted TermValueList.<br />
       * @param map The source HashMap.<br />
       * @param comparator The Comparator used for sorting.<br />
       */<br />
      public TermValueList(HashMap map, Comparator comparator) {<br />
         list = new ArrayList(map.values());<br />
         Collections.sort(list, comparator); <br />
      }<br />
      <br />
      /** The no parameter constructor is not allowed. */<br />
      private TermValueList() {}<br />
<br />
      /** Clears the ArrayList to allow garbage collection. */<br />
      public void clear() {<br />
         list.clear();<br />
      }<br />
         <br />
      /** <br />
       * Returns the term at index i. <br />
       * @return term at index i. <br />
       */<br />
      public String getTerm(int i) {<br />
         return ((TermValue)list.get(i)).term;<br />
      }<br />
      <br />
      /** <br />
       * Returns the termValue at index i.<br />
       * @return The termValue at index i. <br />
       */<br />
      public TermValue getTermValue(int i) {<br />
         return (TermValue) list.get(i);<br />
      }   <br />
<br />
      /**<br />
       * Sorts the TermValueList. (Temporary additional memory is used.)<br />
       * @param comparator The comparator to use for the sort.<br />
       */<br />
      public void sort(Comparator comparator) {<br />
         Collections.sort(list, comparator);<br />
      } <br />
<br />
      /**<br />
       * Returns the number of TermMap entries listed.<br />
       * @return The number of TermMap entries listed.<br />
       */<br />
      public int size() {<br />
         return list.size();<br />
      }<br />
   } // *** end of inner class   TermValueList  ********************************<br />
<br />
   // **************************************************************************<br />
   //  class          TermMap<br />
   // **************************************************************************<br />
   /**<br />
    * This is one of the two fundamental data stores in P05context.<br />
    * It is composed of a HashMap which has String term as key and TermValue<br />
    * objects as value.<br />
    * <br />
    * @author George Smith<br />
    */<br />
   private class TermMap {<br />
<br />
      /** The primary storage HashMap. */<br />
      private HashMap map;<br />
<br />
      /** No parameter constructor. */<br />
      TermMap() {<br />
         map = new HashMap();<br />
      }<br />
<br />
      /** Constructor sets initial map capacity. */<br />
      TermMap(int initialCapacity) {<br />
         map = new HashMap(initialCapacity);<br />
      }<br />
<br />
      /**<br />
       * Returns the number of mappings of term to TermValue.<br />
       * @return The number of mappings of term to TermValue.<br />
       */<br />
      public int size() {<br />
         return map.size();<br />
      }<br />
<br />
      /**<br />
       * Returns the TermValue corresponding to term.<br />
       * @param term The term to find the TermValue for.<br />
       * @return The TermValue corresponding to term.<br />
       */<br />
      public TermValue get(String term) {<br />
         return (TermValue) map.get(term);<br />
      }<br />
<br />
      /**<br />
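       * Increments the term's count (of windows in which it appears),<br />
       * creating the TermValue if the term is not yet in the map.<br />
       * @param term The term whose count is incremented.<br />
       * @param count The amount to add to the term's count.<br />
       */<br />
      public void incCount(String term, int count) {<br />
         // Look the term up in this TermMap's own map, not the outer field.<br />
         TermValue termValue = (TermValue) map.get(term);<br />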
         if (termValue == null) {<br />
            map.put(term, new TermValue(term, count));<br />
         } else {<br />
            termValue.count += count;<br />
         }<br />
      }<br />
<br />
      /**<br />
       * Returns a new unordered TermValueList for this TermMap.<br />
       * @return A new unordered TermValueList for this TermMap.<br />
       */<br />
      public TermValueList getTermValueList() {<br />
         return new TermValueList(map);         <br />
      }<br />
   }<br />
<br />
   // **************************************************************************<br />
   //  class       TermValueDescendingScore<br />
   // **************************************************************************<br />
   /**<br />
    * This comparator can be used by Collections to sort a list of<br />
    * TermValue objects into descending score order.<br />
    * For equal maxScores the objects retain their original order.<br />
    * <br />
    * @author George Smith<br />
    */<br />
   private class TermValueDescendingScore implements Comparator {<br />
<br />
      // The compare method does the comparison. <br />
      public int compare(Object o1, Object o2) {<br />
         float maxScore1 = ((TermValue)o1).score;<br />
         float maxScore2 = ((TermValue)o2).score;<br />
         if ( maxScore1 < maxScore2 ) {<br />
            return +1; <br />
         } else if ( maxScore1 > maxScore2 ) {<br />
            return -1;<br />
         } else {<br />
            return 0;<br />
         }<br />
      }<br />
   } // *** end of inner class   TermValueDescendingScore  *********************<br />
         <br />
   // **************************************************************************<br />
   //  class       TermValueAscendingCount<br />
   // **************************************************************************<br />
   /**<br />
    * This comparator can be used by Collections to sort a list of<br />
    * TermValue objects into ascending count order.<br />
    * For equal counts the objects retain their original order.<br />
    * <br />
    * @author George Smith<br />
    */<br />
   private class TermValueAscendingCount implements Comparator {<br />
<br />
      // The compare method does the comparison.<br />
      public int compare(Object o1, Object o2) {<br />
         int count1 = ((TermValue)o1).count;<br />
         int count2 = ((TermValue)o2).count;<br />
         if ( count1 > count2 ) {<br />
            return +1; <br />
         } else if ( count1 < count2 ) {<br />
            return -1;<br />
         } else {<br />
            return 0;<br />
         }<br />
      }<br />
   } // *** end of inner class   TermValueAscendingCount  **********************<br />
         <br />
   // **************************************************************************<br />
   //  class       TermValueDescendingCount<br />
   // **************************************************************************<br />
   /**<br />
    * This comparator can be used by Collections to sort a list of<br />
    * TermValue objects into descending count order.<br />
    * For equal counts the objects retain their original order.<br />
    * <br />
    * @author George Smith<br />
    */<br />
   private class TermValueDescendingCount implements Comparator {<br />
<br />
      // The compare method does the comparison.<br />
      public int compare(Object o1, Object o2) {<br />
         int count1 = ((TermValue)o1).count;<br />
         int count2 = ((TermValue)o2).count;<br />
         if ( count1 > count2 ) {<br />
            return -1; <br />
         } else if ( count1 < count2 ) {<br />
            return +1;<br />
         } else {<br />
            return 0;<br />
         }<br />
      }<br />
   } // *** end of inner class   TermValueDescendingCount  *********************<br />
         <br />
   // **************************************************************************<br />
   //        inner class     TermValue<br />
   // **************************************************************************<br />
   /**<br />
    * The value objects in the termMap HashMap.<br />
    * There is a one-to-one relationship between TermValue and its term.<br />
    * @author George Smith<br />
    */<br />
   public class TermValue {<br />
<br />
      /** Reference to TermKey term. */<br />
      String term;<br />
<br />
      /** Used for emi calculations and separately for indexing to termPairMap*/<br />
      int count;<br />
            <br />
      /** The association score (temporary storage). */<br />
      float score = 0;<br />
            <br />
<br />
      /** Constructor with term and count. */<br />
      TermValue(String term, int count) {<br />
         this.term = term;<br />
         this.count = count;<br />
      }<br />
         <br />
      /** Constructor with term and score. */<br />
      TermValue(String term, float score) {<br />
         this.term = term;<br />
         this.score = score;<br />
      }<br />
      <br />
      /** <br />
       * Specialized so that term's hashCode is returned.<br />
       * @return Term's hashCode.<br />
       */<br />
      public int hashCode() {<br />
         return term == null ? 0 : term.hashCode();<br />
      }<br />
<br />
      /**<br />
       * Specialized so that only term is compared.<br />
       * @return true if terms of both objects are equal.<br />
       */<br />
      public boolean equals(Object object) {<br />
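         // Null-safe on term, to stay consistent with hashCode().<br />
         return ((object != null<br />
                  && object.getClass() == this.getClass()<br />
                  && (term == null ? ((TermValue) object).term == null<br />
                                   : term.equals(((TermValue) object).term))));<br />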
      }<br />
<br />
   } // *** end of inner class  TermValue  *************************************<br />
      <br />
   /**<br />
    * Does a complete scan of the input file and for each document found<br />
    * it calls readDocument to get a list of the terms in the document,<br />
    * then it calls addTermPairsToTermPairMap which scans a window through the<br />
    * list of terms, scans for term pairs within the window, produces a set of<br />
    * unique term pairs and increments termPairMap with them.<br />
    * <br />
    * @param docsF01file Input file name.<br />
    * @throws IOException <br />
    */<br />
   public void processAllDocuments(String docsF01file) throws IOException {<br />
      consoleReport.fileF01 = docsF01file;<br />
      int docsCount = 0;<br />
<br />
      BufferedReader in = new BufferedReader(new FileReader(docsF01file));<br />
      String line;<br />
      while ((line = in.readLine()) != null) {<br />
         StringTokenizer st = new StringTokenizer(line);<br />
         if ( ! st.hasMoreTokens() || ! st.nextToken().equals("<T>")) {<br />
            continue;<br />
         }<br />
<br />
         // Document has been found. Get a docAsTermList and add its pairs to termPairMap.<br />
         ArrayList docAsTermList = readDocument(in);<br />
         addTermPairsToTermPairMap(docAsTermList);<br />
         docsCount++;<br />
      }<br />
      in.close();<br />
<br />
      if (docsCount == 0) {<br />
         throw new RuntimeException("No documents to read.");<br />
      }<br />
      if (windowsScanned == 0) {<br />
         throw new RuntimeException("No windows in documents.");<br />
      }      <br />
      consoleReport.docsReadF01 = docsCount;<br />
   }<br />
<br />
   /**<br />
    * Does a complete scan of the input file and for each document found<br />
    * it calls readDocument to get a list of the terms in the document,<br />
    * then calls addTermsToTermMap to increment termMap once per unique term. <br />
    * <br />
    * @param docsF01file The file containing the documents.<br />
    * @throws IOException<br />
    */<br />
   private void prescanAllDocuments(String docsF01file) throws IOException {<br />
<br />
      BufferedReader in = new BufferedReader(new FileReader(docsF01file));<br />
      String line;<br />
      while ((line = in.readLine()) != null) {<br />
         StringTokenizer st = new StringTokenizer(line);<br />
         if ( ! st.hasMoreTokens() || ! st.nextToken().equals("<T>")) {<br />
            continue;<br />
         }<br />
<br />
         // Document has been found. Get a docAsTermList and add its terms to termMap.<br />
         ArrayList docAsTermList = readDocument(in);<br />
         addTermsToTermMap(docAsTermList);<br />
      }<br />
      in.close();<br />
   }<br />
<br />
   /**<br />
    * Reads words in the document, forms them into terms and returns the terms <br />
    * in an ArrayList in the document order.<br />
    * @param in The input BufferedReader set to the first line of this document. <br />
    * @return An ArrayList of terms in the document. <br />
    */<br />
   private ArrayList readDocument(BufferedReader in) throws IOException {               <br />
<br />
      ArrayList wordList = new ArrayList();      <br />
      ArrayList docAsTermList = new ArrayList();<br />
<br />
      // Read all lines in this document into wordList.<br />
      String line;<br />
      String word;<br />
      while ((line = in.readLine()) != null) {<br />
         StringTokenizer st = new StringTokenizer(line);<br />
         if ( ! st.hasMoreTokens()) {<br />
            continue;<br />
         }<br />
         word = st.nextToken(); <br />
         if (word.equals("</T>")) {<br />
            break;<br />
         }<br />
         wordList.add(word);<br />
         while (st.hasMoreTokens()) {<br />
            wordList.add(st.nextToken());<br />
         }<br />
      }<br />
<br />
      // If wordCount < windowSize there is nothing more to do.<br />
      if (wordList.size() < windowSize) {<br />
         return docAsTermList;<br />
      }<br />
         <br />
      // Create the docAsTermList of all terms in this document.<br />
      StringBuffer termBuffer = new StringBuffer();<br />
      for (int i = 0; i < wordList.size() - termSize + 1; i++ ) {<br />
            <br />
         // Build the term and add to docAsTermList.<br />
         termBuffer.delete(0, Integer.MAX_VALUE);<br />
         termBuffer.append(wordList.get(i));<br />
         for (int j = 1; j < termSize; j++ ) {<br />
            termBuffer.append(" ").append(wordList.get(i + j)); <br />
         }<br />
         docAsTermList.add(new String(termBuffer));<br />
      }<br />
      return docAsTermList;<br />
   }<br />
         <br />
   /**<br />
    * Ensures that every term in docAsTermList has an entry in termMap.<br />
    * Terms are added with a count increment of 0; the counts themselves are<br />
    * incremented later, once per window, by addTermPairsToTermPairMap.<br />
    * <br />
    * @param docAsTermList A list of terms in one document.<br />
    */<br />
   public void addTermsToTermMap(ArrayList docAsTermList) { <br />
      int winSize = windowSize > 0 ? windowSize : docAsTermList.size(); <br />
<br />
      // Nothing to do if the docAsTermList is less than winSize.<br />
      if (docAsTermList.size() < winSize - termSize + 1) {<br />
         return;<br />
      }<br />
<br />
      // Add all terms in the document to termMap, but without counts.<br />
      for (int i = 0; i < docAsTermList.size(); i++ ) {<br />
         termMap.incCount((String) docAsTermList.get(i), 0);<br />
      }         <br />
   }<br />
<br />
   /**<br />
    * For each unique term pair in each scanning window in the docAsTermList it<br />
    * increments the pair count in termPairMap.<br />
    * For each scanning window it first creates a set of unique term pairs,<br />
    * then it adds the term pairs to termPairMap.<br />
    * <br />
    * @param docAsTermList A list of terms in one document.<br />
    */<br />
   public void addTermPairsToTermPairMap(ArrayList docAsTermList) { <br />
      int winSize = windowSize > 0 ? windowSize : docAsTermList.size(); <br />
      int windowsScanned = 0;<br />
      // Nothing to do if the docAsTermList is less than winSize.<br />
      if (docAsTermList.size() < winSize - termSize + 1) {<br />
         return;<br />
      }<br />
<br />
      // Create a valueList from termMap to match the docAsTermList.<br />
      final ArrayList valueList = new ArrayList(docAsTermList.size());<br />
      for (int i = 0; i < docAsTermList.size(); i++ ) {<br />
         valueList.add(termMap.get((String) docAsTermList.get(i)));<br />
      }<br />
<br />
      // Move window through document.<br />
      float score = 0;<br />
      int termCnt = docAsTermList.size();<br />
      for (int i = 0; i < termCnt - termSize + 1; i++ ) {<br />
<br />
         int wEnd = i + winSize - termSize + 1 < termCnt ?<br />
                                          i + winSize - termSize + 1: termCnt;<br />
<br />
         // Create a termSet for all terms in this window.<br />
         HashSet termSet = new HashSet();<br />
<br />
         // Create a TermPairMap to put the term pairs in initially.<br />
         TermPairMap windowPairMap = new TermPairMap(winSize - termSize + 1);<br />
<br />
         // Move term1 through window, right to end, adding TermValues in pairs<br />
         // from valueList to windowPairMap.<br />
<br />
         for (int j = i; j < wEnd; j++ ) {<br />
            // Add the first term in window to termSet.<br />
            termSet.add(docAsTermList.get(j));<br />
            <br />
            // Select the first term in the pair.<br />
            TermValue tv1 = ((TermValue) valueList.get(j));<br />
            <br />
            // Move term2 through window to the right of term1.<br />
            for (int k = j + 1; k < wEnd; k++ ) {<br />
               windowsScanned++;<br />
               if (proxMethod) {<br />
                  score = 1 / (float) (k - j);<br />
               }<br />
               <br />
               // Select the second term in the pair.<br />
               TermValue tv2 = ((TermValue) valueList.get(k));<br />
<br />
               // Put pair of terms in uniform order so a,b will match b,a.<br />
               if (tv1.term.compareTo(tv2.term) < 0) {<br />
                  windowPairMap.inc(tv1, tv2, 1, score);<br />
               } else if (tv1.term.compareTo(tv2.term) > 0) {<br />
                  windowPairMap.inc(tv2, tv1, 1, score);<br />
               } else {<br />
                  // Do nothing if term1 equals term2.<br />
               }<br />
            }<br />
         }<br />
         // Add the term pairs to termPairMap.<br />
         windowPairMap.run(new TermPairMap.ForEach() {<br />
            public boolean run(Object key1, Object key2, int count, float score)<br />
            {<br />
               termPairMap.inc(key1, key2, 1, score);<br />
               return true; <br />
            }<br />
         });<br />
         <br />
         // Increment the termMap.count for each term in the window.<br />
         Iterator it = termSet.iterator();<br />
         while (it.hasNext()) {<br />
            String str = (String) it.next();<br />
            termMap.incCount(str, 1);<br />
         }<br />
      }<br />
      this.windowsScanned += windowsScanned;<br />
   }<br />
 <br />
<br />
   public void reportProgress() {<br />
      reportProgress("");<br />
   }<br />
<br />
<br />
   public void reportProgress(String message) {<br />
      if (REPORT_PROGRESS && message.length() > 0) {<br />
            System.out.println("# " + message + " @ t="+elapsedRunTime);<br />
      }<br />
   }<br />
<br />
   // **************************************************************************<br />
   // inner class    ElapsedRunTime<br />
   // **************************************************************************<br />
   /**<br />
    * When constructed records the start time.<br />
    * The toString() method returns a formatted elapsed time since start time.  <br />
    * <br />
    * @author George Smith<br />
    */<br />
   private class ElapsedRunTime {<br />
<br />
      /** The system time in mSeconds recorded at startup time. */<br />
      private long startTimeMSec = (new Date()).getTime();<br />
<br />
      /**<br />
       * Returns a formatted elapsed time since start time.<br />
       * @return A formatted elapsed time since start time.<br />
       */<br />
      public String toString() {<br />
         DecimalFormat dfInt1 = new DecimalFormat("#,##0");<br />
         DecimalFormat dfInt2 = new DecimalFormat("#,#00");<br />
         DecimalFormat dfInt3 = new DecimalFormat("#,000");<br />
<br />
         long nowMSec = (new Date()).getTime();<br />
         long elapsedMSec =  nowMSec - startTimeMSec;<br />
         long elapsedSec = elapsedMSec / 1000;<br />
         long elapsedMin = elapsedSec / 60;<br />
         long elapsedHours = elapsedMin / 60;<br />
<br />
         return (dfInt1.format(elapsedHours) + ":" +<br />
                 dfInt2.format(elapsedMin % 60) + ":" + <br />
                 dfInt2.format(elapsedSec % 60) + "." + <br />
                 dfInt3.format(elapsedMSec % 1000));<br />
      }<br />
   } // ****** end of inner class ElapsedRunTime  ******************************<br />
<br />
   // **************************************************************************<br />
   // inner class       MemoryInfo<br />
   // **************************************************************************<br />
   /**<br />
    * MemoryInfo provides easy access to JVM memory statistics. It also retains<br />
    * a value for the maximum memory used, as observed during any method call.<br />
    * <br />
    * @author George Smith<br />
    */<br />
   private class MemoryInfo {<br />
<br />
      /** The maximum memory usage that has been observed. */<br />
      private long maxUsedMemory = getUsedMemory();<br />
<br />
      /**<br />
       * Returns the maximum used memory observed, after rechecking.<br />
       * @return The maximum used memory observed.<br />
       */<br />
      public long getMaxUsedMemory() {<br />
         getUsedMemory();<br />
         return maxUsedMemory;<br />
      }<br />
<br />
      /** <br />
       * Returns the maximum additional memory which can be used.<br />
       * @return the maximum additional memory which can be used.<br />
       */<br />
      public long headRoom() {<br />
         return Runtime.getRuntime().maxMemory() - getUsedMemory();<br />
      }<br />
<br />
      /**<br />
       * Returns current memory usage and recalculates maxUsedMemory.<br />
       * @return Current memory usage.<br />
       */<br />
      public long getUsedMemory() {<br />
         long nowUsedMemory = Runtime.getRuntime().totalMemory()<br />
                              - Runtime.getRuntime().freeMemory();<br />
         if (this.maxUsedMemory < nowUsedMemory) {<br />
            maxUsedMemory = nowUsedMemory;<br />
         }<br />
         return nowUsedMemory;<br />
      }<br />
      <br />
      /** Returns the current used memory as a formatted String. */<br />
      public String toString() {<br />
         DecimalFormat df0 = new DecimalFormat("#,##0");<br />
         return df0.format(getUsedMemory());<br />
      }<br />
   } // ****** end of inner class   MemoryInfo  ********************************<br />
   <br />
   // **************************************************************************<br />
   // inner class       ConsoleReport<br />
   // **************************************************************************<br />
   /**<br />
    * Accumulates console report statistics and then prints them.<br />
    * Specialized for P05context.<br />
    * <br />
    * @author George Smith<br />
    */<br />
   class ConsoleReport {<br />
<br />
      /** An optional first line message. */<br />
      public String message = "";<br />
      /** The name of file F01. */<br />
      public String fileF01;<br />
      /** The name of file F10. */<br />
      public String fileF10;   <br />
      /** The number of documents read from file F01. */<br />
      public int docsReadF01 = -1;<br />
      /** The number of records written to file F10. */<br />
      public int recordsWrittenF10 = -1;<br />
      <br />
      /**<br />
       * Prints the required statistics summary to console. <br />
       */<br />
      public void printStatisticsSummary() {<br />
         DecimalFormat dfInt1 = new DecimalFormat("#,##0");<br />
<br />
         // Print statistics.<br />
         System.out.println(message);<br />
         System.out.println(<br />
            "Number of documents read from " + this.fileF01 + " = " +<br />
            dfInt1.format(this.docsReadF01));<br />
         System.out.println(<br />
            "Number of records written to " + this.fileF10 + " = " + <br />
            dfInt1.format(this.recordsWrittenF10));<br />
         System.out.println(<br />
            "Program execution elapsed time (H:MM:SS.mS) = " + elapsedRunTime); <br />
         // Print maximum memory used.<br />
         System.out.println(<br />
           "Maximum memory used = " + dfInt1.format(memory.getMaxUsedMemory()));<br />
<br />
         // Print the number of unique terms found.<br />
         System.out.println(<br />
            "Unique terms = " + dfInt1.format(termMap.size()));<br />
<br />
         // Print the number of unique term pairs found.<br />
         System.out.println(<br />
            "Unique term pairs = " + dfInt1.format(termPairMap.size()));<br />
<br />
      }<br />
   } // ****** end of inner class  ConsoleReport  ***********************<br />
<br />
<br />
   /**<br />
    * Used to hold an index of IndexLists. Each IndexList holds indexes for <br />
    * one term. The indexes in IndexList refer to TermPairs in termPairMap<br />
    * which relate to that term.  <br />
    * This array of IndexList objects must be correctly sized by the <br />
    * constructor. It does not grow, although each IndexList can.<br />
    * <br />
    * @author George Smith<br />
    */<br />
   private static class IndexListList {<br />
<br />
      /** The data store for this object, an array of IndexList. */<br />
      private IndexList[] indexLists;<br />
<br />
      <br />
      /** The constructor sets the capacity of the store. */<br />
      public IndexListList(int size) {<br />
         indexLists = new IndexList[size];<br />
         for (int i = 0; i < size; i++ ) {<br />
            indexLists[i] = new IndexList();<br />
         }<br />
      }<br />
      <br />
      /** Clears old data in each IndexList (by resetting their sizes to 0). */<br />
      public void clear() {<br />
         for (int i = 0; i < indexLists.length; i++ ) {<br />
            indexLists[i].clear();<br />
         }<br />
      }<br />
      <br />
      /** Gets the number of IndexLists available. */<br />
      public int size() {<br />
         return indexLists.length;<br />
      }<br />
      <br />
      /** Returns the IndexList at index i. */<br />
      public IndexList index(int i) {<br />
         return indexLists[i];<br />
      }<br />
      <br />
      /**<br />
       * Used with TermPairMap.ForEachWithIndex to extract the index from<br />
       * TermPairMap for each term which has an index mark in either key1 or<br />
       * key2's count field and put it into indexListList at the correct place!<br />
       * <br />
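       * @param key1 The first TermValue of the pair.<br />
       * @param key2 The second TermValue of the pair.<br />
       * @param count The pair's count (not used here).<br />
       * @param score The pair's score (not used here).<br />
       * @param index The index of this pair in termPairMap.<br />
       * @return Always returns true.<br />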
       */<br />
      public boolean extractIndexes(Object key1, Object key2, <br />
                                          int count, float score, int index) {<br />
         TermValue tv1 = (TermValue) key1;<br />
<br />
         if (tv1.count >= 0) {<br />
            indexLists[tv1.count].add(index);<br />
         }<br />
         TermValue tv2 = (TermValue) key2;<br />
         if (tv2.count >= 0) {<br />
            indexLists[tv2.count].add(index);<br />
         }<br />
         return true;                            <br />
      }<br />
   }<br />
}
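
For anyone trying to follow what addTermPairsToTermPairMap is doing, here is a minimal stand-alone sketch of the same windowed pair counting. It is not the code above: TermPairMap is not shown in this post, so the sketch assumes a plain map keyed by a hypothetical "a|b" string, assumes termSize is 1, and leaves out the proximity score. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class WindowPairSketch {

    /**
     * For each unordered term pair, counts the number of windows in which
     * the pair occurs. A window starts at every term position and is
     * clipped at the end of the document, as in the loop bounds above.
     */
    static Map<String, Integer> countPairs(List<String> terms, int winSize) {
        Map<String, Integer> pairCounts = new TreeMap<>();
        for (int i = 0; i < terms.size(); i++) {
            int wEnd = Math.min(i + winSize, terms.size());

            // Collect the unique pairs of this window first, so that a pair
            // repeated inside one window is still counted only once.
            Set<String> windowPairs = new HashSet<>();
            for (int j = i; j < wEnd; j++) {
                for (int k = j + 1; k < wEnd; k++) {
                    String a = terms.get(j);
                    String b = terms.get(k);
                    int cmp = a.compareTo(b);
                    if (cmp == 0) {
                        continue; // a term paired with itself is ignored
                    }
                    // Uniform order so that a,b matches b,a.
                    windowPairs.add(cmp < 0 ? a + "|" + b : b + "|" + a);
                }
            }
            for (String pair : windowPairs) {
                pairCounts.merge(pair, 1, Integer::sum);
            }
        }
        return pairCounts;
    }

    public static void main(String[] args) {
        // prints {a|b=2, a|c=2, b|c=1}
        System.out.println(countPairs(Arrays.asList("a", "b", "a", "c"), 3));
    }
}
```

The per-window set plays the role of windowPairMap in the original: pairs are de-duplicated inside one window before the global counts are incremented.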

-----------------------------------------------------------------------------
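
One more detail that may not be obvious from the listing: the output stage stores each float score in the int count field as a fixed-point value (score * 100000 + .5), sorts on that int, and divides by 100000 again when printing. A small sketch of that round trip follows. The class name, the helper names and the "0.00000" pattern are assumptions for illustration; the df5 formatter is not defined in the part of the code shown here.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class ScoreEncodingSketch {

    /** Scale factor used for the fixed-point score, as in the listing. */
    static final int SCALE = 100000;

    /** Packs a non-negative score into an int, rounding to 5 decimal places. */
    static int encode(float score) {
        return (int) (score * SCALE + .5);
    }

    /** Unpacks the int back into a float score. */
    static float decode(int count) {
        return (float) count / SCALE;
    }

    public static void main(String[] args) {
        // Assumed pattern; Locale.ROOT keeps the decimal point a '.'.
        DecimalFormat df5 = new DecimalFormat(
                "0.00000", DecimalFormatSymbols.getInstance(Locale.ROOT));

        int packed = encode(0.25f);
        // prints 25000 -> 0.25000
        System.out.println(packed + " -> " + df5.format(decode(packed)));
    }
}
```

Because the cast truncates toward zero, adding .5 only rounds correctly for non-negative scores, which is presumably why -1 can be used as the "not in list" marker for count.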