`
(11)Publication number : 2004-213091
(43)Date of publication of application : 29.07.2004
(51)Int.Cl.
G06F 17/30
G06K 9/00
G06T 1/00
(21)Application number : 2002-378481
(71)Applicant : CANON INC
CANON SALES CO INC
(22)Date of filing : 26.12.2002
(72)Inventor : YOSHII YUKIHIRO
`
`(54) DEVICE FOR SEARCHING DOCUMENT IMAGE, AND METHOD THEREFOR,
`
`SYSTEM FOR SEARCHING DOCUMENT IMAGE, AND PROGRAM
`
`(57)Abstract:
`
PROBLEM TO BE SOLVED: To provide a document image search device, its method, a document image search system and a program which permit efficient search for the desired result of OCR, and which permit easy visual identification of the desired character image in a document image corresponding to the result of OCR.

SOLUTION: An OCR server 300 carries out a search, using a first search string constituting inputted search conditions. Then, on the basis of the search result, the server carries out a search again, using a second search string generated by replacing a part of the first search string with a wild card. Furthermore, the server generates a comparative display image of a document image, corresponding to the result of OCR which has been searched for or searched for again.
`
`* NOTICES *
`
`JPO and INPIT are not responsible for any
`
`damages caused by the use of this translation.
`
`1.This document has been translated by computer. So the translation may not reflect
`
`the original precisely.
`
`2.**** shows the word which can not be translated.
`
3.In the drawings, any words are not translated.
`
`[Claim(s)]
`
`[Claim 1]
`
A document image retrieval device that manages OCR results of document images and searches for an OCR result matching an input search condition, the device comprising:
a 1st search means for performing a search using a 1st search string that constitutes the input search condition;
a 2nd search means for performing, based on the search results of said 1st search means, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card; and
a generating means for generating a contrast display image of the OCR result found by said 1st or 2nd search means and the corresponding document image.
`
`[Claim 2]
`
The document image retrieval device according to claim 1, wherein, in the contrast display image, the display attribute of the character image in the document image that corresponds to said 1st search string or 2nd search string differs from the display attribute of the other character images.
`
`
`
`[Claim 3]
`
The document image retrieval device according to claim 1, further comprising:
a storage means for storing an erroneous recognition character list that associates and manages groups of characters that are easily misrecognized; and
a 3rd search means for performing, based on the search results of said 2nd search means, a re-search using a 3rd search string in which a 1st character in said 1st search string that is managed in the erroneous recognition character list is replaced with a 2nd character associated with that 1st character,
wherein the generating means generates a contrast display image of the OCR result found by said 3rd search means and the corresponding document image.
`
`[Claim 4]
`
The document image retrieval device according to claim 1, wherein said 2nd search means performs a search using a search string in which the 2nd character of said 1st search string is replaced with a wild card and, when no search results are obtained with that 2nd search string, repeats searches using search strings in which each character in the 1st search string is replaced with a wild card, until search results are obtained.
`
`[Claim 5]
`
The document image retrieval device according to claim 1, wherein said 2nd search means replaces with a wild card a character in said 1st search string that has more than a predetermined stroke count.
`
`[Claim 6]
`
A document image search system in which a document image retrieval server, which manages OCR results of document images and searches for an OCR result matching an input search condition, and a search terminal, which inputs the search condition, are connected to each other via a network, wherein
the search terminal comprises:
an input means for inputting a search condition;
a 1st transmitting means for transmitting the search condition to the document image retrieval server;
a 1st receiving means for receiving, from the document image retrieval server, search results corresponding to the search condition; and
a displaying means for displaying the search results, and
the document image retrieval server comprises:
a 2nd receiving means for receiving the search condition from the search terminal;
a 1st search means for performing a search using a 1st search string that constitutes the search condition;
a 2nd search means for performing, based on the search results of said 1st search means, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card;
a generating means for generating a contrast display image of the OCR result found by said 1st or 2nd search means and the corresponding document image; and
a 2nd transmitting means for transmitting the contrast display image to the search terminal.
`
`[Claim 7]
`
A document image search method that manages OCR results of document images and searches for an OCR result matching an input search condition, the method comprising:
a 1st search step of performing a search using a 1st search string that constitutes the input search condition;
a 2nd search step of performing, based on the search results of said 1st search step, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card; and
a generation step of generating a contrast display image of the OCR result found in said 1st or 2nd search step and the corresponding document image.
`
`[Claim 8]
`
The document image search method according to claim 7, wherein, in the contrast display image, the display attribute of the character image in the document image that corresponds to said 1st search string or 2nd search string differs from the display attribute of the other character images.
`
`[Claim 9]
`
The document image search method according to claim 7, further comprising:
a storage step of storing an erroneous recognition character list that associates and manages groups of characters that are easily misrecognized; and
a 3rd search step of performing, based on the search results of said 2nd search step, a re-search using a 3rd search string in which a 1st character in said 1st search string that is managed in the erroneous recognition character list is replaced with a 2nd character associated with that 1st character,
wherein the generation step generates a contrast display image of the OCR result found in said 3rd search step and the corresponding document image.
`
`[Claim 10]
`
The document image search method according to claim 7, wherein said 2nd search step performs a search using a search string in which the 2nd character of said 1st search string is replaced with a wild card and, when no search results are obtained with that 2nd search string, repeats searches using search strings in which each character in the 1st search string is replaced with a wild card, until search results are obtained.
`
`[Claim 11]
`
The document image search method according to claim 7, wherein said 2nd search step replaces with a wild card a character in said 1st search string that has more than a predetermined stroke count.
`
`[Claim 12]
`
A control method of a document image search system in which a document image retrieval server, which manages OCR results of document images and searches for an OCR result matching an input search condition, and a search terminal, which inputs the search condition, are connected to each other via a network, the method comprising:
an input step of inputting a search condition;
a 1st transmission step of transmitting the search condition to the document image retrieval server;
a 1st search step of performing a search using a 1st search string that constitutes the search condition;
a 2nd search step of performing, based on the search results of said 1st search step, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card;
a generation step of generating a contrast display image of the OCR result found in said 1st or 2nd search step and the corresponding document image; and
a 2nd transmission step of transmitting the contrast display image to the search terminal.
`
`[Claim 13]
`
A program for causing a computer to perform document image search that manages OCR results of document images and searches for an OCR result matching an input search condition, the program comprising:
program code for a 1st search step of performing a search using a 1st search string that constitutes the input search condition;
program code for a 2nd search step of performing, based on the search results of said 1st search step, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card; and
program code for a generation step of generating a contrast display image of the OCR result found in said 1st or 2nd search step and the corresponding document image.
`
`[Claim 14]
`
A program for causing a computer to control a document image search system in which a document image retrieval server, which manages OCR results of document images and searches for an OCR result matching an input search condition, and a search terminal, which inputs the search condition, are connected to each other via a network, the program comprising:
program code for an input step of inputting a search condition;
program code for a 1st transmission step of transmitting the search condition to the document image retrieval server;
program code for a 1st search step of performing a search using a 1st search string that constitutes the search condition;
program code for a 2nd search step of performing, based on the search results of said 1st search step, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card;
program code for a generation step of generating a contrast display image of the OCR result found in said 1st or 2nd search step and the corresponding document image; and
program code for a 2nd transmission step of transmitting the contrast display image to the search terminal.
`
`
`
`
`DETAILED DESCRIPTION
`
`[Detailed Description of the Invention]
`
`[0001]
`
`[Field of the Invention]
`
The present invention relates to a document image retrieval device that manages OCR results of document images and searches for an OCR result matching an input search condition, and to a method therefor, a document image search system, and a program.
`
`[0002]
`
`[Description of the Prior Art]
`
As a technique for converting documents into electronic data, it is conventionally known to input a document image and perform OCR (optical character recognition) on it. As an application of this technique, document management devices have been realized that associate and manage a document image subjected to OCR with the OCR result (character code data) obtained from it, and that use the OCR result to search for the corresponding document image.
`
`[0003]
`
With such a document management device, character code data contained in the document image to be retrieved is input as a search condition, the OCR results containing that character code data are searched, and the OCR result found, or the corresponding document image, is displayed. At this time, the character code data input as the search condition is displayed within the OCR result so as to be distinguishable from the other character code data, which informs the user of the search status.
`
`[0004]
`
JP,H9-237320,A discloses a technique for generating a file that can display a read document restored, within the range of characters that can be represented by character codes, in a form in which the format of the read document is easy to recognize visually.
`
`[0005]
`
JP,H10-134141,A discloses a technique in which characters written on a sheet are optically read to obtain a recognition result, the recognition result is compared with electronic data that corresponds to the characters written on the sheet and is stored beforehand in a storage medium to obtain a matching result, and the characters of the electronic data are displayed with the display method switched according to the matching result, so that the matching result can be checked visually.
`
`[0006]
`
`[Problem to be solved by the invention]
`
However, the OCR described in the above prior art cannot achieve 100% recognition, and OCR results contain a considerable number of recognition errors. Consequently, even when character code data that ought to be contained in an OCR result is used as a search condition to search for the OCR result and the corresponding document image, satisfactory search results sometimes cannot be obtained.
`
`[0007]
`
Moreover, to find a desired character image in the document image corresponding to an OCR result, the character code data that is the OCR result of the desired character image must first be input as a search condition. The user then has to locate the desired character image in the corresponding document image by eye, relying on the display position of the search-condition character code data, which is shown in the OCR result distinguishably from the other character code data. This takes time and effort.
`
`[0008]
`
The present invention has been made to solve the above problems. Its purpose is to provide a document image retrieval device, a method therefor, a document image search system, and a program that can search OCR results efficiently and make it easy to visually recognize a desired character image in the document image corresponding to an OCR result.
`
`[0009]
`
`[Means for solving problem]
`
The document image retrieval device according to the present invention for attaining the above purpose has the following configuration. Namely, it is a document image retrieval device that manages OCR results of document images and searches for an OCR result matching an input search condition, the device comprising:
a 1st search means for performing a search using a 1st search string that constitutes the input search condition;
a 2nd search means for performing, based on the search results of the 1st search means, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card; and
a generating means for generating a contrast display image of the OCR result found by the 1st or 2nd search means and the corresponding document image.
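The two search stages described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the names `two_stage_search` and `wildcard_regex` are hypothetical, OCR results are modelled as plain strings keyed by a document id, the wild card is assumed to match exactly one character, and the re-search is assumed to run only when the first search finds nothing.

```python
import re
from typing import List, Mapping, Tuple


def wildcard_regex(query: str, position: int) -> "re.Pattern":
    """Compile `query` with the character at `position` replaced by a wild card."""
    parts = [re.escape(ch) for ch in query]
    parts[position] = "."  # assumption: the wild card matches exactly one character
    return re.compile("".join(parts), re.DOTALL)


def two_stage_search(ocr_texts: Mapping[str, str], query: str,
                     wildcard_position: int = 0) -> Tuple[str, List[str]]:
    """Return the stage that produced the hits and the matching document ids."""
    # 1st search: the search string exactly as entered.
    hits = [doc_id for doc_id, text in ocr_texts.items() if query in text]
    if hits or not query:
        return "first", hits
    # 2nd search (assumed trigger: the 1st search found nothing).
    pattern = wildcard_regex(query, wildcard_position)
    hits = [doc_id for doc_id, text in ocr_texts.items() if pattern.search(text)]
    return "second", hits
```

For example, if OCR misread "invoice" as "inv0ice", the exact search fails but the re-search with the fourth character wild-carded still finds the document.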
`
`[0010]
`
Preferably, in the contrast display image, the display attribute of the character image in the document image that corresponds to the 1st search string or the 2nd search string differs from the display attribute of the other character images.
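One way to picture the differing display attributes is to tag each character image with an attribute before the document image is rendered. The sketch below is an assumption-laden illustration: the text does not say how character positions are stored, so the `CharBox` record (one bounding box per recognized character) and the attribute names "highlight" and "normal" are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharBox:
    char: str    # character code produced by OCR
    x: int       # left edge of the character image, in pixels
    y: int       # top edge of the character image, in pixels
    width: int
    height: int


def display_attributes(boxes: List[CharBox], match_start: int,
                       match_length: int) -> List[str]:
    """Assign one display attribute per character image.

    Characters inside the matched span get "highlight"; all others get "normal".
    """
    match_end = match_start + match_length
    return ["highlight" if match_start <= i < match_end else "normal"
            for i in range(len(boxes))]
```

A renderer could then draw the highlighted boxes in a different colour or frame so that the matched character images stand out in the document image.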
`
`[0011]
`
Preferably, the device further comprises a storage means for storing an erroneous recognition character list that associates and manages groups of characters that are easily misrecognized, and a 3rd search means for performing, based on the search results of the 2nd search means, a re-search using a 3rd search string in which a 1st character in the 1st search string that is managed in the erroneous recognition character list is replaced with a 2nd character associated with that 1st character. The generating means then generates a contrast display image of the OCR result found by the 3rd search means and the corresponding document image.
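The substitution behind the 3rd search string can be sketched as below. The function names and the sample list entries are hypothetical (the list contents in this document are not recoverable from the translation), and replacing one character at a time is an assumption.

```python
from typing import Iterator, List, Mapping, Sequence, Tuple


def third_search_strings(query: str,
                         confusable: Mapping[str, Sequence[str]]) -> Iterator[str]:
    """Yield each string made by swapping one listed character for a look-alike."""
    for i, ch in enumerate(query):
        for alternative in confusable.get(ch, ()):
            yield query[:i] + alternative + query[i + 1:]


def third_search(ocr_texts: Mapping[str, str], query: str,
                 confusable: Mapping[str, Sequence[str]]) -> List[Tuple[str, str]]:
    """Return (search string, document id) pairs for every substituted string that hits."""
    return [(candidate, doc_id)
            for candidate in third_search_strings(query, confusable)
            for doc_id, text in ocr_texts.items()
            if candidate in text]
```

With an illustrative list such as `{"l": ["1", "I"], "O": ["0"]}`, the query "lOg" expands to "1Og", "IOg" and "l0g", each of which is tried against the stored OCR results.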
`
`[0012]
`
Preferably, the 2nd search means performs a search using a search string in which the 2nd character of the 1st search string is replaced with a wild card and, when no search results are obtained with that 2nd search string, repeats searches using search strings in which each character in the 1st search string is replaced with a wild card, until search results are obtained.
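A minimal sketch of this retry loop follows, assuming that the 2nd character is tried first and the remaining positions are then tried from left to right (the order after the 2nd character is not stated in the text). The name `moving_wildcard_search` is hypothetical, and the wild card is again assumed to match exactly one character.

```python
import re
from typing import List, Mapping, Optional, Tuple


def moving_wildcard_search(ocr_texts: Mapping[str, str],
                           query: str) -> Tuple[Optional[int], List[str]]:
    """Return the wild-card position that first produced hits, and those hits."""
    if len(query) < 2:
        return None, []
    # Try the 2nd character (index 1) first, then every other position.
    order = [1] + [i for i in range(len(query)) if i != 1]
    for position in order:
        parts = [re.escape(ch) for ch in query]
        parts[position] = "."  # one-character wild card
        pattern = re.compile("".join(parts), re.DOTALL)
        hits = [doc_id for doc_id, text in ocr_texts.items() if pattern.search(text)]
        if hits:
            return position, hits
    return None, []
```

The loop stops at the first position that yields a result, which keeps the relaxed search as close to the original query as the assumed ordering allows.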
`
`[0013]
`
Preferably, the 2nd search means replaces with a wild card a character in the 1st search string that has more than a predetermined stroke count.
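The stroke-count rule can be illustrated as follows. The stroke table here is a tiny hand-made sample for illustration only; a real system would need a full stroke-count table, which this document does not describe. The function name, the "?" wild-card symbol, and treating unknown characters as having zero strokes are all assumptions.

```python
from typing import Mapping


def wildcard_complex_characters(query: str, stroke_counts: Mapping[str, int],
                                threshold: int, wildcard: str = "?") -> str:
    """Replace every character with more than `threshold` strokes by a wild card.

    Characters missing from the table are treated as having zero strokes
    and are therefore left unchanged.
    """
    return "".join(wildcard if stroke_counts.get(ch, 0) > threshold else ch
                   for ch in query)


# Illustrative sample table (character -> stroke count).
SAMPLE_STROKES = {"一": 1, "木": 4, "鬱": 29}
```

The idea is that characters with many strokes are the ones OCR is most likely to misread, so they are relaxed first while simple characters stay literal.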
`
`[0014]
`
The document image search system according to the present invention for attaining the above purpose has the following configuration. Namely, it is a document image search system in which a document image retrieval server, which manages OCR results of document images and searches for an OCR result matching an input search condition, and a search terminal, which inputs the search condition, are connected to each other via a network.
The search terminal comprises:
an input means for inputting a search condition;
a 1st transmitting means for transmitting the search condition to the document image retrieval server;
a 1st receiving means for receiving, from the document image retrieval server, search results corresponding to the search condition; and
a displaying means for displaying the search results.
The document image retrieval server comprises:
a 2nd receiving means for receiving the search condition from the search terminal;
a 1st search means for performing a search using a 1st search string that constitutes the search condition;
a 2nd search means for performing, based on the search results of the 1st search means, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card;
a generating means for generating a contrast display image of the OCR result found by the 1st or 2nd search means and the corresponding document image; and
a 2nd transmitting means for transmitting the contrast display image to the search terminal.
`
`[0015]
`
The document image search method according to the present invention for attaining the above purpose has the following configuration. Namely, it is a document image search method that manages OCR results of document images and searches for an OCR result matching an input search condition, the method comprising:
a 1st search step of performing a search using a 1st search string that constitutes the input search condition;
a 2nd search step of performing, based on the search results of the 1st search step, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card; and
a generation step of generating a contrast display image of the OCR result found in the 1st or 2nd search step and the corresponding document image.
`
`[0016]
`
The control method of the document image search system according to the present invention for attaining the above purpose has the following configuration. Namely, it is a control method of a document image search system in which a document image retrieval server, which manages OCR results of document images and searches for an OCR result matching an input search condition, and a search terminal, which inputs the search condition, are connected to each other via a network, the method comprising:
an input step of inputting a search condition;
a 1st transmission step of transmitting the search condition to the document image retrieval server;
a 1st search step of performing a search using a 1st search string that constitutes the search condition;
a 2nd search step of performing, based on the search results of the 1st search step, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card;
a generation step of generating a contrast display image of the OCR result found in the 1st or 2nd search step and the corresponding document image; and
a 2nd transmission step of transmitting the contrast display image to the search terminal.
`
`[0017]
`
The program according to the present invention for attaining the above purpose has the following configuration. Namely, it is a program for causing a computer to perform document image search that manages OCR results of document images and searches for an OCR result matching an input search condition, the program comprising:
program code for a 1st search step of performing a search using a 1st search string that constitutes the input search condition;
program code for a 2nd search step of performing, based on the search results of the 1st search step, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card; and
program code for a generation step of generating a contrast display image of the OCR result found in the 1st or 2nd search step and the corresponding document image.
`
`[0018]
`
Another program according to the present invention for attaining the above purpose has the following configuration. Namely, it is a program for causing a computer to control a document image search system in which a document image retrieval server, which manages OCR results of document images and searches for an OCR result matching an input search condition, and a search terminal, which inputs the search condition, are connected to each other via a network, the program comprising:
program code for an input step of inputting a search condition;
program code for a 1st transmission step of transmitting the search condition to the document image retrieval server;
program code for a 1st search step of performing a search using a 1st search string that constitutes the search condition;
program code for a 2nd search step of performing, based on the search results of the 1st search step, a re-search using a 2nd search string in which a part of the 1st search string is replaced with a wild card;
program code for a generation step of generating a contrast display image of the OCR result found in the 1st or 2nd search step and the corresponding document image; and
program code for a 2nd transmission step of transmitting the contrast display image to the search terminal.
`
`[0019]
`
`[Mode for carrying out the invention]
`
Hereafter, an embodiment of the present invention is described in detail with reference to the drawings.
`
`[0020]
`
Fig. 1 shows the configuration of the document image search system of this embodiment.
`
`[0021]
`
100 is a scanner PC (personal computer), which controls various operations of the scanner 102, including its input operation, and saves document images input from the scanner 102 in the preservation folder 101. Based on an instruction from the OCR server 300, the document images saved in the preservation folder 101 may be transmitted to the OCR server 300 so that the OCR server 300 manages the document images in a unified manner.
`
`[0022]
`
200 is an image management server, which, for example, controls various operations of the network scanner 500 connected to the network 600, including its input operation, and saves document images input from the network scanner 500 in the preservation folder 201. Based on an instruction from the OCR server 300, the document images saved in the preservation folder 201 are transmitted to the OCR server 300.
`
`[0023]
`
The image management server 200 may be configured so that document images input with the network scanner 500 are saved not in the preservation folder 201 but on a storage device connected to the network 600 or on another PC. In this case, the preservation folder 201 manages position information (for example, an address, URL, or IP address) that indicates where each document image is saved.
`
`[0024]
`
300 is an OCR server. It saves document images received via the network 600 in the database 301 as image data, performs OCR on each document image, and saves the OCR result in the database 302 as formatted text data. At this time, the document image subjected to OCR and the formatted text data that is its OCR result are associated with each other and managed. 303 is an erroneous recognition character list, which manages groups of characters that are easily misrecognized in OCR. Examples of such erroneous recognition characters include "**", "nose", "basket", and "**".
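The association between a document image and its OCR result can be pictured with a small in-memory model. This is only a sketch: `OcrServerStore` and its fields are hypothetical stand-ins for the databases 301 and 302 and the list 303, and the formatted text data is simplified to a plain string.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OcrServerStore:
    images: Dict[str, bytes] = field(default_factory=dict)          # stands in for database 301
    ocr_text: Dict[str, str] = field(default_factory=dict)          # stands in for database 302
    confusable: Dict[str, List[str]] = field(default_factory=dict)  # stands in for list 303

    def register(self, doc_id: str, image: bytes, text: str) -> None:
        """Store a document image and its OCR result under the same id."""
        self.images[doc_id] = image
        self.ocr_text[doc_id] = text

    def search(self, query: str) -> List[str]:
        """Return the ids of documents whose OCR result contains `query`."""
        return [doc_id for doc_id, text in self.ocr_text.items() if query in text]
```

Because both stores share the document id, a hit in the OCR text leads directly to the corresponding document image.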
`
`[0025]
`
Although the databases 301 and 302 are configured independently here, each database may of course be configured as a different storage area on a single storage medium.
`
`[0026]
`
Examples of formatted text data include the formats of various word-processing software, such as Word (registered trademark) of Microsoft Corp. and Ichitaro (registered trademark) of JUST System.
`
`[0027]
`
400 is a search PC. For example, it inputs a character code as a search condition and can display, as search results, the formatted text data managed by the OCR server 300 and the corresponding document image.
`
`[0028]
`
500 is a network scanner, which can be remotely controlled by the servers and PCs connected to the network 600.
`
`[0029]
`
600 is a network, which interconnects the various components that constitute the document image search system of this embodiment.
`
`[0030]
`
The various servers that constitute the document image search system of this embodiment have a web server function. A PC that accesses those servers uses a web browser to access the web sites they provide and performs various processing. Alternatively, each server may provide a client program containing a dedicated GUI (graphical user interface), and the PC may perform the various processing using that client program.
`
`[0031]
`
Although Fig. 1 shows one scanner PC 100 and one network scanner 500, it goes without saying that there may be two or more of each.
`
`[0032]
`
Next, the hardware configuration of the various terminals and servers that constitute the document image search system of this embodiment is described using Fig. 2.
`
`[0033]
`
Fig. 2 shows the hardware configuration of the various terminals and servers that constitute the document image search system of this embodiment.
`
`[0034]
`
In Fig. 2, the CPU 21, RAM 22, ROM 23, LAN adapter 24, video adapter 25, input part (keyboard) 26, input part (mouse) 27, hard disk 28, and CD-ROM drive 29 are connected to one another via the system bus 20. The system bus 20 means, for example, a PCI bus, an AGP bus, or a memory bus. Fig. 2 omits the chips that connect the buses and input/output interfaces such as the keyboard interface and so-called SCSI or ATAPI interfaces.
`
`[0035]
`
The CPU 21 performs various operations, such as the four arithmetic operations and comparison operations, and controls hardware and software. The RAM 22 stores the operating system program and application programs (including the programs that execute the flow charts, described later, performed by each terminal or server) read from the hard disk 28 or from a storage medium such as a CD-ROM or CD-R loaded in the CD-ROM drive 29, and these are executed under the control of the CPU 21.
`
`[0036]
`
The ROM 23 stores, among other things, the so-called BIOS, which cooperates with the operating system to manage input and output to the hard disk and other devices. The LAN adapter 24 cooperates with the communications program of the operating system controlled by the CPU 21 to communicate with external devices via the network. The video adapter 25 generates the picture signal output to a display device (not shown), and the input part (keyboard) 26 and input part (mouse) 27 are used to input instructions to the terminal.
`
`[0037]
`
The hard disk 28 stores the operating system and the above-mentioned application programs, which are loaded into the RAM 22 when the terminal starts up or as needed.
`
`[0038]
`
The CD-ROM drive 29 is used to load a storage medium such as a CD-ROM, CD-R, or CD-R/W and to install application programs on the hard disk 28.
`
`[0039]
`
It goes without saying that a CD-R drive, CD-R/W drive, MO drive, or the like may be used instead of the CD-ROM drive 29.
`
`[0040]
`
Next, the processing performed by the document image search system of this embodiment is described.
`
`[0041]
`
The processing performed by the document image search system of this embodiment is broadly divided into two processes. One is document image management processing, which inputs a document image, performs OCR on it, and manages the document image and its OCR result. The other is document image retrieval processing, which uses the managed OCR results to search for a desired OCR result and the corresponding document image.
`
`[0042]
`
First, document image management processing is described using Fig. 3.
`
`[0043]
`
Fig. 3 is a flow chart showing the document image management processing of this embodiment.
`
`[0044]
`
In Fig. 3, OCR is performed for the document image input from the scanner 102
connected to the scanner PC 100 by the OCR server 300, and the case where the
`
`OCR result and document image a