Presenter:Bob Tabor (from LearnVisualStudio.NET)
Full-Text search allows you to save large portions of text — even Microsoft Word documents, or other file formats — into your database and perform complex queries based on this textual data while expecting great performance. This lesson demonstrates how to set up Full-Text search paying particular attention to the nuances of setting it up within SQL Server Express. Then we move to creating full-text indexes, configuring the index population settings, and performing queries that demonstrate the power of the CONTAINS predicate and its many, many variations.
Click here to go to original video page
The following text is a software generated transcript of the video. Click on a minute link to jump to a location within the video
Minute 0
This video will provide a quick lesson on using the full text indexing and search capabilities of SQL Server 2005 express edition with fulltext searching quickly search through large amounts of character based data fields you can also search through entire documents like Word documents or other office documents HML XML or even if he got the documents that are stored in your database tables not sure you could use the like predicate of the where clause to search through text data but it s not really optimized search through large amounts of data in each field so this is a special scenario that required a special type of solution and the right tool for that job is fulltext search additionally fulltext search allows you to produce queries that rank the results differently based on a number of different criteria for example the weights that you define to each part was
Minute 1
Show or possibly to the proximity to other keywords in your search or even to intellectual forms of specific words and in the end to that embryo look at how to search for the term foot but also get search results that and have other forms of the word like feet for sample while also you can tweak how the indexing is performed you can tweak the thesaurus you can filter out the noise terms like the workflow and the word and so on and quite a bit more unfortunately I can t demonstrate all of this in this lesson possibly get you started with how to set up old text searching a high great catalogs and indexes and several of her types of searches to get you started at the something that you be implementing your own applications will justly want to consult the SQL Server 2005 books online starting with the article does title fulltext search
Minute 2
Fulltext search concepts and rigor and you start out by installing SQL Server 2005 express edition with advanced services first of all let s go to the express edition websites you go to SQL Server express download you ll scroll down to the actual download page and under number three you ll see that on the right hand side there is the ability to download SQL Server 2005 express edition with advanced services SP ones who you ll want to download that additionally you may want to download the adventure work samples so by clicking on the download link inception of her for
Minute 3
Of the adventure Works of DB MSI affords a liberty of downloading both of these and so was this quickly walked through the steps of installing SQL Server advance services and all show you what we ll need to do in order to get this to work correctly and so one of thing somebody was greater new instance on my existing machine all SQL Server 2005 express edition with advance services and will use that for the examples and just adventure how to set up fulltext indexing and searching and will go through the normal steps at a usually take in order to install SQL Server runs its configuration check will want to unhide advanced configuration
Minute 4
Database under feature selection will want to go to the database services in select fulltext search will be installed a local hard drive and slept next and will call this new instance full text search and select the next button and will select the default options for the next couple of screens here will do to the screen that will allow us to select whether or not we want to install user instances if you re not sure what user instances are I would refer you to lesson number seven in this video series but what s important here is that fulltext search is incompatible with user instances when you install SQL express with advance services you can either make sure to deselect the enabled user instances checkbox
Minute 5
Order to turn off for disabled user instances now if you forget to do that or if you want to use fulltext search on an existing database you need to turn the user instances off manually and he can do that with the following T SQL statement to open up a new query window make sure that you re attached to the bench for Works database or whatever database if you re currently working on and run the SP configure and user instances enabled and set that value equal to zero and the remainder of the options we can just select the defaults and I m going to pause recording as it continues to install after completing the installation you may want to set up the adventure Works database the sample database that all use for the next few minutes there are plenty
Minute 6
Columns in the adventure Works database tables that we can create indexes on experiment with stalls merely double click the sidewalk the installation steps for going to the media steps of installing the adventure Works database and so going to shut down my folder and then going to launch SQL Server management studio express are going to connect to my new instance and would attach to the adventure Works database using the steps that we ve seen in previous lessons
Minute 7
Databases and SQL server or fulltext enabled by default unless they re created a by using SQL Server meta studio express to enable a database for fulltext search when you create a database by using management studio to the databases to create a new database and you want to select this useful text indexing option and the continued praying your database with the other options as well to cancel for now now if you want to enable existing database for fulltext search will right click the database select properties include files page on the left hand side and make sure that use full text indexing is turned on as a business case next reluctance to fulltext indexing on the table in our database
Minute 8
In SQL Server 2005 visit to seven step process for small will pray to fulltext catalog in order to store the fulltext indexes and then it will create fulltext indexes so at a high level you have this hierarchy you have a database and a database can contain zero or more fulltext catalogs and each fulltext catalog can contain one or more fulltext indexes known are simple scenario world would create one fulltext catalog and each on current fulltext index of all be put within that fulltext catalog and hopefully the cessation will become clear once the oppressors of the examples of using SQL Server 2005 management studio express unfortunately some of the wizards that help you through the process have not been included so you need to license and install whichever version of SQL Server 2005 in order to
Minute 9
Gain access to some of those tools however you can still use fulltext search bubbles had to write some code by hand and honestly it s not all that complicated once you understand some of the basics so first of all let s create a fulltext catalog and open group window and paste of transact SQL create fulltext catalog in the name of the catalog in this case was to recall something simple like my fulltext catalog and I ll execute this and after a few seconds we d be completed successfully message so what is to will the statement creates the fulltext catalog in the default directories specified during our setup there s a folder now named my fulltext catalog is in the default directory for our instance of SQL Server
Minute 10
The catalog is where all the index files as well as configuration files will be stored on our hard drive SQL Server fulltext search creates indexes which are simply special files that keep account for each word in each document or each column and associate each word back to the record in the database for that work was found so as you can imagine these files can grow quite large so let s open up our Windows Explorer and navigate to the directory where our catalog was created and seen the FT data we now have fulltext catalog and these files that are managed underneath this folder or indexes sources and all
Minute 11
Information that fulltext search needs a run broken social perspective if you liked has more information on the structure of an index go to this document out of all and be MSDN help or find this article within SQL Server books online you can see that it has nice walk through uses an example documents how the index would be paraded for those example documents is an explanation as well as another were on the topic of indexes now that we ve created that fulltext catalog and accepted to create a fulltext index before you print a fulltext index you have to determine whether the table already has a unique single column non noble index of the fulltext engine for
Minute 12
Server which is called the MSFT SQL service uses this unique index to map each row in the table to a unique compressible key factor already have such index you don t have to create another one you don t already have an index that you have a great one and so here s an example out of you for one particular table for a unique index if I call the UI for unique index_product review on tablename which was production product review and never pass in the columns that will create the index on Sonata to have unique key create a fulltext index on the product review table a fulltext index stores information about significant words
Minute 13
And their location within a given document this information is used to quickly compute fulltext queries a search for rows for the particular word or combination of words can create a fulltext index on one or more columns of the table in a database only one fulltext indexes allow per table so first of all let s look at a simple platform of this fulltext index creation is that we have the term full for a fulltext index on this is the tablename is a column that we re going to want to index this is the unique index that we created in the previous step is the name of the catalog and we created earlier and were we want the files the indexes to be up to be stored on our local hard drive
Minute 14
A lot of explanation the process of creating and maintaining a fulltext index is called index population this final clause in our statement here specifies how you plan on initially creating and then maintaining your fulltext index under three different types of modes of population for folders full population actively occurs whenever the fulltext catalog to fulltext indexes first populated the indexes can be maintained using change tracking or incremental populations and unless you tell them not to this happens automatically when every creeping index the second is a voter population is change tracking and so after the initial population is complete and SQL Server can track the changes that are made to your data and propagate those changes to your fulltext index since we have auto turned on
Minute 15
Statement with an agency with change tracking auto on the SQL Server will does handle this for us automatically and finally there s another mode of population that is incremental timestamp based and this allows you to manually kick off an update to only those rows that a change to the last population of date as specified in a special daytime feel that you also have to supply the reason you get this granularity of control is because maintaining a fulltext index can be very processor and file I O intensive operation in fact for this three reasons some organizations on database administrators from a band is completely also some web hosting companies may not allow you to do this abuts assuming that you didn t want to use this you do have a level control and for example you might want to schedule it to perform only at certain times of the day you get to specify how you plan on maintaining that index at the
Minute 16
Time when you created so if you want to use anything else besides auto you basically need to execute and alter statement on the index so that s updated with new values for the columns specified in index so please refer to SQL Server books online for more details about the difference am index population modes and because many uses later I m going to execute this right now and why would you look at a second example of great fulltext index also and more obligated one pace in actually two things here think of it as a spacer here are a fan for statement is going to create unique index for us on the production document table and will want to create that unique index on the document ID column and after we do data going to then create a fulltext index on production that document
Minute 17
You ll see here are the acts we specify the column name to such we did in the produce example but there s also two additional arguments that were passing and the first argument called type column has the the column name file extension so let s do this listen to look at our production document table individuals database good clue about what is actually going on here is a production that document table and tell received we have our document ID and pretty unique index on we have a file named file extension and then also the document stored in for binary max solicited some just for fun must go ahead and open table reviewed to the values that are stable
Minute 18
And so here we can see the title of the document to filename where it s stored on our filesystem file extension you see that this is so these are word documents that are being stored in a document s column of our database or actions for work documents here and index the indexing of a fulltext search and so this file extension column becomes important basically restating here that we want a fulltext search engine to index a binary field that contains a doc file a Word document full text search must know which column in our database will contain the file extension for the file backstage in the document column so uses that file extension to load the appropriate filter for that document type in this case is going to a
Minute 19
Filter for word documents and you might be wondering well what are all the types of documents that you can store with the Navarre binary and expect them to to be indexed by fulltext search and indexing while the students run a quick little query here space in and that information is stored in assistant table call fulltext document types execute this and will see the ads we can index SPX pages as well as ASC ex so controls to create for our ASP net pages act batch files and files on C source files is our dock and thought for templates HTML files INI files and there s quite a few even and three used so playlists PowerPoint
Minute 20
And PowerPoint templates URLs VBScript XML and sure there are ways to set it so that you can print other Palm filters for your own custom file extensions so any rate that s what this line does the next little option here is language zero ex zero and the supplies of the language that should be used when parsing the fulltext column is suffering inside here for a neutral language so OS so is the locale identifier for a neutral language but you can specify a given language for sample if you know you ll be storing documents that are written in French you can specify French year as opposed to just selecting whatever default locale is within
Minute 21
On your installation of SQL Server so based on the language a different set of files or reference for dictionaries and words that are used to parse through the text things like word breakers severs noise words thesaurus and so on amok and explain easily got the SQL Server books online has information about these kinds of dictionaries and these office settings are used to parse through the words in the document you re storing the database or create an index so as to know which dictionary you should use the English to French the German and so on and once it knows that information that can choose the dictionary is appropriate for that language and use the dictionary are those dictionaries are of information they can figure out what terms are nouns which ones are verbs which ones can just be ignored while you re doing or indexing so in this case as a citizen moment ago you can see that if set to zero ex zero and the acts
Minute 22
Is a hexadecimal value that for the neutral language locale identifier soda words in this case were asking you to look at the localization settings or instance of SQL Server which by default is English that is actually quite a few more indexing options once again if your plane using this own purposes in her gently to want to research this more you can learn more about grading the fulltext index at the following URL and paste it here in the article is called create fulltext index pseudo paws jotted down okay so let s go ahead and execute this recreate the unique index to create the fulltext index and seems to work for a okay so once the index is
Minute 23
Builds of statistics are collected for use in ranking and now you can start writing queries to collect information from your index so let s do this first of all a star of a very simple select query that uses a special predicate called contains in the where clause here and select add them to execute again this is probably the simplest type of fulltext great and possibly create others quiet a few other qualifier sophisticate hereto will thought what does this moment but fundamentally we have were were selecting the comments column from our product review table that we board created a fulltext index on and going to search for the term learning curve now we might want switch over here to results attacks that we can see results will the debtors to the
Minute 24
An agency here fooled through the text will find learning curve know as I said a moment ago there s actually quite of few options for the contains predicate and someone referred you yet to another resource on a SQL Server books online or MSDN here you can find the full reference for the contained statement on and it s actually quite a few pages long if you were to try to print this out but a nutshell of the contains predicate can search for a word or phrase the prefix of work or phrase a word that s near another word for word doesn t flush and we generated from another word for example of the work to rise is the inflectional stem of drives drove driving a driven you can also search for word as a synonym of another word using a thesaurus for sampled the word
Minute 25
Battle can have synonym such as aluminum steel is a want to begin attack on some of the additional options of contains predicate you can develop quite powerful of inquiry into your text so there are two basic parts of the contains statements to see here first of all we have a column or you can actually include a list of columns that you d be searching against and those columns have been included in the index that you actually created for your dinner table at and most of the time as a gun here you would probably only specify just one column but it s entirely possible to search gets along with columns so the second part of the contained statement is the actual search condition below is that I have two sets of quotes a double set a quote that surrounds to words which means I will treat these essentially is
Minute 26
Work together and also the miss a single switches defines the entire contains search condition so was little slot will quicken to look at the different kinds of things he can do within this contains search condition first of all you can use a wildcard so in the Stigler case were looking for the term computer star and would prevent computer computers computation and compulsory you can use ant or not and also the binary representations of those ampersands pipe symbols and so on agency others rode to the performance of his last two examples where were using hand search for those terms anywhere in the document as opposed to a complete phrase computer software install in that final example there you can also use the word in year or so were you looking for
Minute 27
And ranking higher than those results with documents that contain the word computer closer to the word software so in this case were beginning to refine our search results not just to bring back any documents that have both the work tutor and software but to rank higher in those documents that were those two words are even closer together that the next will example will take look at our other forms of which allows us to search for inflectional forms of free sampled were happy which would match the word happy happy your happiest and happily and so on and it finally will deliver that some rather complex example of a geek is about phrase and that allows us to specify the weighted term keyword so in this figure case we want to give different weights
Minute 28
Different terms than are search so the work development shows up several times that would outweigh a search result that had perhaps the worst computer and software several times as well just because of the weight has been associated with the work development in this example end I could go on and on and on little lot of other options here I just wanted to point out a few quick examples ones that I saw immediate value in this of the last couple Mr us want to see a couple of these types of queries and actions you don t bring back for us is to experiment with so the first zeppelin will show you a go ahead and he sitting here so in this first example on the show couple things are besides the creation of unique index of the fulltext index
Minute 29
Here comes to the show this simple select statement that uses contains and others to interesting things about this first of all I have a where clause that contains other predicates which I ve never really seen up to this point so pure and filtering out the product description idea five for my results aware product description is greater lesson five and it also might contain statement using the handsome looking for an inclusion of work aluminum and spindle within my results At next week that they can see it brings back description aluminum alloy cops and large diameter spindle so even though these are again not right next to each other using he and you can search for them within the same document are a next example is private or comp
Minute 30
As a basis for some also pretty unique index on the person address table and create a fulltext index is well on the address of one of column and now in the SQL statement self aware point out there are two things about this query that are interesting for us although several years he is about phrase in order to wait different items within the search results differently was a something row quickly forget to far less go to the person address table with columns so our address line 1 is what will be using here is type and parks are 60 and select execute statement
Minute 31
And we will use that it s a constant lookout different rankings so he is about if you recall from a few months ago will allow skewed differently to different values in this case for modifying using a hand the wildcard symbol to say if its day in street and give at the highest weights day and then followed with a work view and give its some weights but will live less than if you were followed by street and then all the other ones can be just you know given equal weight so we have this first one based reads with higher rank values will get back to what this is just a moment in the Bay view has a lesser rank and finally we have Peter Corr s a very Bayshore and they have
Minute 32
Less rank okay so the first part of this is usually is about statement the second part is where Mrs rank value will notice that were not using contains were using a variation on it called contains table hands it basically creates a temporary table that contains the tables key column in a column called brain in the rank columns a value from zero through 1000 for each row that indicates how well rolled match the selection criteria so in this particular case we are doing it in her joint with his temporary table called tea table and gratuitous sex wares be addressed out address ID of the primary key equals this temporary table called tea table and refining right here and for exhibit equal to the key of acting so that we can join those two together in a reason for joining
Minute 33
The leading display both the address line 1 and the ranking within our results then forgot you set ranking to order by a column in a dissenting fashion the base of the free complex where he was going to want a slight less advanced to reuse that forms a key word in order to get inflectional forms of foot so let s execute that one and will see some of the results that are being brought back here and see the fine forms of foot a place I can find work foot here let s see we have a 1 feet okay so inflectional forms again will give us not only those results that contain our search criteria but also any forms of that work as well
Minute 34
Okay in a finale going to paste in another brief action convoluted long bit of code here and will attempt to do in this is to update a fulltext index by inserting a new row into the database specifically into the product review table and if we set the fulltext index correctly on that table and we set it up with the change tracking set to auto as we discussed earlier then whenever we get down later in the query of a forcible wait for 30 seconds I should give the index plenty of time to index this new row data and they were going to try to find it within our database by crafting a SQL statement that specifically geared grabbing out those elements that we would expect to see in a table so let s go ahead and select all this
Minute 35
And then select the escape on hands free to wait for about 30 seconds so I m when a pause recording and after 30 seconds you can see that we did get a lot of the row returned for to us within our within our select statement here that looks for in the comments where we have the worst venture works in Redmond and Tabor and so we were able to successfully execute a fulltext index a query on a new role of data that was indexed correctly we gave us some time to execute it so very cool if you don t get the results that you re expecting you want to ensure that you ve follow the steps correctly in setting up your fulltext indexes also making sure that fulltext indexer she is turned off her database in that user instances are turned off you would obviously want to
Minute 36
Dolts and do some testing with on taking a look at your event viewer and looking in the application session to be see if there is any information there are any warnings for a more errors that may have occurred you also what makes sure to check and school administrator tools here and check the services and it s a service that is being used for fulltext index searching is the SQL Server fulltext search physical properties of the name of it is the MS FT SQL exe are you met Marla make sure that it is using the correct account to log into your instance of SQL Server and finally he
Minute 37
Realm of debugging with students and close down then let s open up whose Explorer and we re going to take a look at where Reagan s final law asked for fulltext index program files produce SQL Server and look at module releases SQL Server go to him the MS SQL folder to the log folder and you ll see here SQL FT with the numbers to the right of it log and you can open this up into Notepad and view all the messages from fulltext auto population from the indexing critic for your for your table and let me conclude by saying that this was not
Minute 38
Intended to be a comprehensive look at fulltext indexing and fulltext searching is quite a bit more but hopefully this will be start off in the right direction hoping to the studio did please visit to learn Visual Studio net where you can download over five restraint and videos on all topics related to net see sharp Visual Basic ASP net even SQL Server and a lot more thanks every day
Thursday, May 29, 2008
Video: Enabling Full-Text Search in your Text Data
Labels:
full text search
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment