Help with a mini search engine in C++

Thread Starter

moslem

Joined Dec 16, 2009
20
Hello everyone,
I have a project about a mini search engine,
and this is what our doctor wrote for us:
You are given a folder of input text or HTML files (say 50 files) and a set of keywords (say 40 keyword).
Build an efficient data structure (an index file) using hashing that will provide information as which files
contain a certain keyword and in which line. The structure should be saved on hard disk and loaded in
memory upon system startup. Then you build a small program in which a user will have a simple
interface asking for a keyword:
Input Keyword to Search:
The user then types "water" for example.
The program then searches the index file and should have an output such as:

The keyword “water” exists in:
water-resources.html:
Line 212: The water problem in the middle east ....
Line 345: The Nile water is the main source of ...
Line 2003: Water is a blessing from God ...
arab-water-supply.txt:
Line 2: shortage of water in ..
Line 25: Libya has ample supply of water from rain ...
and so on. Can you handle the rule that when a file is changed, the index is re-constructed?
Please, I would like some ideas about this and a plan for the project.
Thanks a lot.
 

someonesdad

Joined Jul 7, 2009
1,583
This is a common and relatively straightforward task usually assigned in a beginning data structures course. Almost any algorithm/data structures book will have information on it. Instead of asking people for ideas, consult one of the many textbooks that discuss the technique; then let us know what parts of the problem you don't understand. Your problem statement doesn't say whether you have to design the hashing yourself or whether you can use a library (writing your own hash table is a bit more work, but a good exercise if you've never done it before, especially devising decent strategies to deal with collisions). If you can use a library, the GNU and Boost libraries each provide a hash-based dictionary/map container that could be used to satisfy the spirit of the problem, if not the teacher's wishes. :)

Hints: some of the major tasks are parsing the input, creating and populating the data structure you use, then gathering and processing the input from the user.
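
As a rough illustration of those tasks, here is a minimal sketch of the parse-and-populate step. It rests on assumptions: it uses std::unordered_map as the hash table (the assignment may instead require hand-written hashing), it treats a "word" as whatever sits between whitespace, and the names Hit, Index, normalize and indexStream are made up for this example.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// One match: the file it came from, the 1-based line number, and the line text.
struct Hit {
    std::string file;
    int line;
    std::string text;
};

// Keyword -> every place it occurs. std::unordered_map is a hash table.
using Index = std::unordered_map<std::string, std::vector<Hit>>;

// Keep only letters and digits, lower-cased, so "Water," matches "water".
std::string normalize(const std::string& word) {
    std::string out;
    for (unsigned char c : word) {
        if (std::isalnum(c)) out += static_cast<char>(std::tolower(c));
    }
    return out;
}

// Scan one input stream line by line and record the lines that contain
// any of the given keywords (keywords are assumed to be lower-case already).
void indexStream(std::istream& in, const std::string& fileName,
                 const std::unordered_set<std::string>& keywords, Index& index) {
    std::string line;
    int lineNo = 0;
    while (std::getline(in, line)) {
        ++lineNo;
        std::istringstream words(line);
        std::string word;
        std::unordered_set<std::string> seen;  // record a keyword once per line
        while (words >> word) {
            const std::string key = normalize(word);
            if (keywords.count(key) != 0 && seen.insert(key).second) {
                index[key].push_back({fileName, lineNo, line});
            }
        }
    }
}
```

To index a real file you would open it with std::ifstream and pass that stream together with the file name; answering a query is then a lookup such as index.find("water"). Saving the index to disk and rebuilding it when a file changes are left out of this sketch.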
 

Ahmed2010

Joined Jun 30, 2010
3
I have the same project too ... Dr. Khaled fo2ad is international now.
Anyway, do you know how to get the name of the file you are currently reading?
And is there a function that returns the line number?
I thought about a counter that increases on every line, but how can I know that I'm on a new line?
 

Thread Starter

moslem

Joined Dec 16, 2009
20
The getline command reads a line up to the Enter (newline), so each call to getline gives you one whole line. Put the call in a for loop that runs until you reach the required line.
 

Ahmed2010

Joined Jun 30, 2010
3
But as I said before, there is no ENTER in some files, which means there is no NULL at the end of the line, so how do you know that the line has ended?
 

retched

Joined Dec 5, 2009
5,207
If there is no 'ENTER', then it can still be considered the same line.
It will not "wrap" until there is an 'ENTER', so if you have a very wide screen, it will be one line.
 