I am using full text search to search through PDF documents using the Adobe iFilter. Everything works fine. Now, is it possible that I get a sentence which contains my searched keyword? For example:
Keyword:
'fox'
Query result:
'The quick brown fox jumps over the lazy dog.'
'Fox is a small red animal.'
modified 3-Aug-12 7:53am.
|
Member 8024623 wrote: Now, is it possible that I get a sentence which contains my searched keyword?
Ehr.. yes, especially since you already mentioned that it works. Did you give it a try?
I got the feeling that I'm misunderstanding your question. English isn't my native tongue, and it helps if there's a bit of explanation and some code to indicate what is expected. That said, kudos for including the example with the expected output.
Bastard Programmer from Hell
if you can't read my code, try converting it here[^]
|
I also think you've misunderstood my question. When I said that everything works fine, I meant that full text search over PDF documents works. I can retrieve a PDF document that contains a searched keyword. But I need to retrieve the sentence or sentences, from that or any other document, that contain that keyword.
|
asimptota777 wrote: But, I need to retrieve a sentence or sentences from that or any other document that contain that keyword.
Then you will need IFilters for each type of document that's in your database. There's no way of reading "every" document, since each file format (and each of its versions) has a different encoding and layout.
You can download those for Office here[^] (2007/2010).
If you have a document in there in your own file-format (binary serialized data?) you'd probably have to provide your own IFilter implementation (guidelines on MSDN).
All existing and widely used formats should have an implementation, since the same IFilter is in use for Desktop Search, SharePoint and the like. And of course, CodeProject[^] has a lot of articles on the subject.
|
Thank you for the answer, but I think you haven't read my question carefully. In my first post I said that I store PDF documents in the database (in FILESTREAM, to be precise) and I use the Adobe iFilter for full text search through PDF documents. That works fine. What I need is a way to get the sentences from those PDF documents that contain a certain keyword. What T-SQL syntax can I use to extract sentences that contain a certain word?
|
asimptota777 wrote: What T-SQL syntax can I use to extract sentences that contain a certain word?
USE AdventureWorks2012;
GO
DECLARE @SearchWord nvarchar(30)
SET @SearchWord = N'performance'
SELECT Description
FROM Production.ProductDescription
WHERE FREETEXT(Description, @SearchWord);
From MSDN[^]
|
Ok, I know about FREETEXT. But with this command you get THE WHOLE Description field. I need to get only the sentence that contains a keyword.
Query:
SELECT Description
FROM Production.ProductDescription
WHERE FREETEXT(Description, 'smooth');
Result:
1. Suitable for any type of riding, on or off-road. Fits any budget. Smooth-shifting with a comfortable ride.
2. Top-of-the-line competition mountain bike. Performance-enhancing options include the innovative HL Frame, super-smooth front suspension, and traction for all terrain.
3. Aerodynamic rims for smooth riding.
4. Excellent aerodynamic rims guarantee a smooth ride.
I need result like this:
1. Smooth-shifting with a comfortable ride.
2. Performance-enhancing options include the innovative HL Frame, super-smooth front suspension, and traction for all terrain.
3. Aerodynamic rims for smooth riding.
4. Excellent aerodynamic rims guarantee a smooth ride.
|
asimptota777 wrote: Ok, I know about FREETEXT. But with this command you get THE WHOLE Description field. I need to get only the sentence that contains a keyword.
It doesn't work that way; the whole document is returned, since it could contain the searched word more than once, in multiple locations. You can easily write some code to find the sentence containing the searched word and extract it.
asimptota777 wrote: Result:
That's a list of requirements, not a programming question.
|
Dude, I know what full text search is and how it functions. I just don't know how to implement the functionality I've mentioned. I must admit that your signature really suits you...
|
asimptota777 wrote: I must admit that your signature really suits you...
Thank you - it was hard to earn that title
You could repost the question with a link to this thread. State in the new post that this wasn't helpful, and someone might come up with something better.
asimptota777 wrote: Dude, I know what full text search is and how it functions. I just don't know how to implement the functionality I've mentioned.
I did not look at the requirements; technically, you want to retrieve documents based on a search term. You get a list of documents that contain that term. You can retrieve the document.
What's keeping you from doing a substring on that document and parsing out the sentence? You can fetch the index where the word occurs in the document, then work your way back to the first word with a capital letter, and forward to the first punctuation mark.
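For what it's worth, the steps above could be sketched in T-SQL roughly like this. It is only a sketch under a few assumptions: the document text is already available as an nvarchar value (as far as I know, full text search does not hand back the text the iFilter extracted from a PDF), the variable names are made up for the example, and instead of looking for a capital letter it simply walks back to the previous '.', '!' or '?', which is a simplification of what is described above.

```sql
-- Sketch only: @doc stands in for text you have already extracted from the document.
DECLARE @doc  nvarchar(max) = N'Suitable for any type of riding, on or off-road. Fits any budget. Smooth-shifting with a comfortable ride.';
DECLARE @word nvarchar(30)  = N'smooth';

-- Index of the first occurrence of the keyword (case-insensitive only under a case-insensitive collation).
DECLARE @pos int = CHARINDEX(@word, @doc);

IF @pos > 0
BEGIN
    -- Walk back: distance from the keyword to the previous sentence terminator (0 = none found).
    DECLARE @back int = PATINDEX(N'%[.!?]%', REVERSE(LEFT(@doc, @pos - 1)));
    -- Walk forward: offset of the next terminator, counted from the keyword (0 = none found).
    DECLARE @fwd  int = PATINDEX(N'%[.!?]%', SUBSTRING(@doc, @pos, LEN(@doc)));

    DECLARE @start int = CASE WHEN @back = 0 THEN 1 ELSE @pos - @back + 1 END;
    DECLARE @end   int = CASE WHEN @fwd  = 0 THEN LEN(@doc) ELSE @pos + @fwd - 1 END;

    -- LTRIM drops the space that follows the previous terminator.
    SELECT LTRIM(SUBSTRING(@doc, @start, @end - @start + 1)) AS Sentence;
END
```

With a case-insensitive collation this should return only the last sentence of the sample text. It finds just the first occurrence, and abbreviations such as "e.g." will cut a sentence short, so treat it as a starting point rather than a finished splitter.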
|
Hi all,
Every time my code tries to connect to my Access 2007 database, the message "Unrecognized Database Format" pops up. I've read that it is usually due to a corrupt database file or to opening an old Access database file in a newer version of Access. I can rule out the latter because I created and ran the file in Access 2007. I can reasonably assume that since I can open the file, it is not corrupt. If those two aren't the causes, what else could be causing it? The connection string in my web.config file takes the following form:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\myFolder\myAccess2007file.accdb;Persist Security Info=False;
I don't know if that has anything to do with it. Any suggestions will be greatly appreciated. Thanks for your time.
|
This information is REALLY old, but there used to be an option to compact and repair an Access database. If this option is still available, I suggest you use it. Being able to open the database does not mean it is not corrupted.
Never underestimate the power of human stupidity
RAH
|
ASPnoob wrote: I can reasonably assume that since I can open the file, it is not corrupt.
Wrong assumption. The database is corrupt; this usually happens when Access terminates in the middle of a write. It cannot amend the database with random bytes when it comes back online, so you get the next best thing: you get to see the database as Access thinks the underlying data should be.
So, install SQL Server Express and start upsizing all the tables that are still within the database, and have the Wizard migrate the data with it.
Did your database allow access to multiple users simultaneously?
|
Thank you, everyone for responding. My database was created less than 10 minutes prior to testing it in my code. It only has one table with one field in the table. I just cannot understand how it could be corrupted. I will give everyone's suggestion a try, thanks again for your reply.
|
Did you use Microsoft Access in an environment where multiple users would manipulate data at the same time? Is your database on some network-share?
Microsoft Access is a damn good tool to manipulate single-user local-file databases with strong reporting capabilities.
It's not built to be a sharing-facility for data. If that's what you need, you'll need to switch to a real server - otherwise crap like this will happen again.
|
If we are selecting a set of data and one of the fields is erroring out with an arithmetic overflow error, how can we move this row aside and continue processing the remaining rows, so that the entire stored procedure does not fail?
More precisely: can we move the row causing the issue into a table and continue processing the remaining rows?
Is there something like On Error Resume Next?
I tried a TRY...CATCH block.
Within my TRY I had a select statement with a good value and one with a bad value; I wanted the select to print the good one and not the bad one.
======================
declare @dummy int
begin try
    -- good value: 50 * 10000 = 500000 fits in an int and in 8 characters
    select right('00000000' + cast(cast(round(50,0)*10000 as int) as varchar(8)),8)
    -- bad value: 500000 * 10000 overflows int, so control jumps to the catch block
    select right('00000000' + cast(cast(round(500000,0)*10000 as int) as varchar(8)),8)
end try
begin catch
    set @dummy = 1
end catch
print @dummy
modified 2-Aug-12 11:06am.
|
Try a bigint in place of the int. Care to explain why you are rounding the 500000, and what the casting is all about?
|
To give a bit more history: we are creating this view to feed into another application, so the output has to be fixed length.
Suppose this view returns 100 rows, and I get an arithmetic overflow on row 56 only. I want to skip 56 and continue from 57. When I tried the TRY...CATCH block, it does catch the error, but once it hits row 56 the whole thing stops.
|
vanikanc wrote: To give a bit more history
Did you try what I suggested? Y/N?
vanikanc wrote: I want to leave 56 and continue from 57.
Filter the row out before casting it into infinity, make the result-variable bigger, or try padding it with zeroes after the cast.
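A rough sketch of the "filter it out first" option, which also sets the offending row aside in a separate table. The table variables, column names and sample rows are invented for the example, since the real view is not shown; the limit of 9999 is an assumption derived from the eight-character width and the multiplication by 10000.

```sql
-- Hypothetical source rows; row 56 carries the mistyped value.
DECLARE @Source TABLE (RowId int NOT NULL, Amount decimal(18,4) NOT NULL);
INSERT INTO @Source (RowId, Amount) VALUES (55, 50), (56, 500000), (57, 75);

-- Rows whose scaled value would not fit in eight digits are set aside ...
DECLARE @Rejected TABLE (RowId int NOT NULL, Amount decimal(18,4) NOT NULL);
INSERT INTO @Rejected (RowId, Amount)
SELECT RowId, Amount
FROM @Source
WHERE ROUND(Amount, 0) NOT BETWEEN 0 AND 9999;

-- ... and only the remaining rows are cast and zero-padded.
-- The CASE guard keeps the cast from being evaluated for an out-of-range value,
-- rather than relying on the WHERE clause being applied first.
SELECT RowId,
       CASE WHEN ROUND(Amount, 0) BETWEEN 0 AND 9999
            THEN RIGHT('00000000' + CAST(CAST(ROUND(Amount, 0) * 10000 AS int) AS varchar(8)), 8)
       END AS FixedWidth
FROM @Source
WHERE ROUND(Amount, 0) BETWEEN 0 AND 9999;

SELECT RowId, Amount FROM @Rejected;
```

The point of the design is that the bad row is excluded on purpose and recorded, instead of being swallowed by an error handler.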
|
I tried with bigint - still errored out.
|
vanikanc wrote: I tried with bigint - still errored out.
Was worth a shot.
right('00000000' + cast(cast(round(500000,0)*10000 as int) as varchar(8)),8)
Where is the @dummy used here? 500 000 * 10 000 = 5 000 000 000. That's bigger than an int can hold and, at ten digits, wider than eight characters. Some suggestions:
- Lose the round function
- Don't multiply by 10k; pad it with 4 zeros and save the string representation.
- If the dummy needs to be multiplied by 500k, then multiply it by 5 and pad zeros again
- Don't limit the varchar to eight characters - use varchar(50)
What kind of number are you trying to display? Can you give us an example of a number "before" (the original dummy) and the one "after" (the resulting dummy)?
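One possible reason the bigint attempt still failed, assuming the cast was only wrapped around the finished product: with two int operands the multiplication itself is done as int and overflows before the outer cast is reached. A small sketch using the literal values from the thread (the real column type in the view is unknown, so this may not be the whole story):

```sql
-- Both literals are int, so the product is computed as int and overflows
-- before the outer CAST is applied. Uncomment to see the error:
-- SELECT CAST(500000 * 10000 AS bigint);

-- Widening one operand first makes the multiplication itself bigint.
SELECT CAST(500000 AS bigint) * 10000 AS Scaled;

-- The string representation then needs room for more than eight characters.
SELECT CAST(CAST(500000 AS bigint) * 10000 AS varchar(20)) AS ScaledText;
```

This only removes the overflow; the result is still ten digits long, so it does not fit the eight-character layout.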
|
Thank you for all your suggestions!
It is just some inherited code, and I really don't know the ins and outs of it. It has been working for years, so I don't want to mess with what comes in and goes out.
There was an error entering the value: it was supposed to be 50 and the user keyed in 500000. So the business wants us to put such errors aside and continue processing the data. I guess we have to code for human errors!!
Thanks for all your time and suggestions!
|
vanikanc wrote: Thank you for all your suggestions!
My pleasure, hope I wasn't too rude.
vanikanc wrote: It is just some inherited code,
Ah, that's always a good one to include on the first post. If we see code, we assume you wrote it, and partially understand it.
vanikanc wrote: There was an error entering the value, it was supposed to be 50 and the user keyed in 500000. So, the business wants us to put aside such errors and continue processing the data.
On Error Resume Next indeed. That's not how engineers work; if it fails, it fails for a reason. It gets corrected or excluded beforehand, not ignored. It's how managers work; if it fails, and nothing is burning, it's not a problem. Just ship the damn product already, we'll fix the bugs later.
Perhaps this would be a good time to add a validator to the entry-field of that user, and sanitize his/her input before it gets into the system. Once the value has been entered, it should be treated as "correct".
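If the entry form itself cannot be changed, a database-side counterpart of that validator could be a CHECK constraint, so an out-of-range value is refused when it is entered rather than discovered later by the view. A minimal sketch; the temp table, its columns and the 0 to 9999 range are assumptions made for illustration, not taken from the real system.

```sql
-- Hypothetical entry table; the constraint is left unnamed on purpose because it is a temp table.
CREATE TABLE #Entry
(
    EntryId int IDENTITY(1,1) PRIMARY KEY,
    Amount  decimal(18,4) NOT NULL CHECK (Amount BETWEEN 0 AND 9999)
);

INSERT INTO #Entry (Amount) VALUES (50);      -- accepted

BEGIN TRY
    INSERT INTO #Entry (Amount) VALUES (500000);  -- violates the CHECK constraint
END TRY
BEGIN CATCH
    -- Report the rejection back to whoever entered the value.
    SELECT ERROR_MESSAGE() AS Rejection;
END CATCH;

SELECT EntryId, Amount FROM #Entry;  -- only the valid row was stored

DROP TABLE #Entry;
```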
You heard about Knight Capital? Seems they had an "On Error Resume Next" idea too, and the algo kept buying stocks in packages of 100 at a time, 20 to 25 times a second - for over half an hour! (Total >440 million* losses - let's just be glad they weren't a hospital relying on that software)
Ignoring errors is the worst offence in IT; the system could have skipped customer 59 for all we know. The best approach is preached by (forgive me) PHP, and it's called "do or die". Either the app does what it should do, or there's an unexpected exception - and since we cannot guarantee that we're still working with valid data (unexpected situation, who knows which variables are loaded and which are not?) we have only one realistic option: let the app die. Terminate.
That's always better than continuing and writing records with an outdated identity-value after an exception, not knowing that you're corrupting a database that was still correct when the app died.
*) I checked this time whether I should use million or billion.
|
Hello Friends
I have the SELECT statements below and need to write them all as one query. How does that work in MS SQL Server, in Management Studio?
Select E.FirstName + ' ' + E.LastName as Supervisor
From [cgs].dbo.Employ E with (NoLock)
INNER JOIN [cgs].dbo.tems h ON (h.Supervisor = E.EmploId)
where h.temId = 336
Select E.FirstName + ' ' + E.LastName as Agent
From [cgs].dbo.Employ E with (NoLock)
where E.EmploId = 2305
Select count(*) As Tras
From CGCSLF
Where Dispo = 'Tras'
Select count(*) As Comp
From CGCSLF
Where Dispo = 'Comp'
SELECT AgeID, Sum(Hours) As HOURS
From CGCSLF
Group By AgeID
Raman
Thank you in advance
|
I'm unclear what you require.
If you are using the ADO.net classes in SqlClient then you can put all those in one Command -- just separate them with semi-colons (;). You can then use ExecuteReader to get a DataReader to read the results -- use the NextResult method to advance to the results from the next statement. I'm fairly sure that DataAdapters will handle it as well, but I haven't used one for several years.
If you need to do something else, then please clarify your question.
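If the goal is fewer statements rather than one command, the two counts can be folded into a single pass, and the remaining statements can follow in the same batch separated by semicolons. A sketch only: the table variable stands in for CGCSLF, and its column types and sample rows are guesses.

```sql
-- Stand-in for the CGCSLF table from the question (types and rows are assumptions).
DECLARE @CGCSLF TABLE (AgeID int, Dispo varchar(10), Hours decimal(9,2));
INSERT INTO @CGCSLF (AgeID, Dispo, Hours)
VALUES (1, 'Tras', 2.5), (1, 'Comp', 4.0), (2, 'Comp', 3.0);

-- One pass instead of two COUNT(*) queries: COUNT skips the NULLs that the
-- CASE produces for non-matching rows.
SELECT COUNT(CASE WHEN Dispo = 'Tras' THEN 1 END) AS Tras,
       COUNT(CASE WHEN Dispo = 'Comp' THEN 1 END) AS Comp
FROM @CGCSLF;

-- A further statement in the same batch; it comes back as a second result set.
SELECT AgeID, SUM(Hours) AS Hours
FROM @CGCSLF
GROUP BY AgeID;
```

Queries that return a different shape, such as the hours per AgeID, still come back as their own result set, which is where NextResult comes in on the client side.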