Home .NET Entity Framework 6 with Full-Text Search via LINQ

Entity Framework 6 with Full-Text Search via LINQ

by admin

I want to share my crutch to solve rather trivial problem: how to connect MSSQL full-text search to Entity Framework.The topic is very specialized, but in my opinion, it is very relevant today. If you are interested, please khat.

It all started with pain

I develop projects in C# (ASP.NET) and sometimes write microservices. Most of the time I use MSSQL database to work with data. As a link between the database and my project I use Entity Framework. With EF I have a wide range of opportunities to work with data, form the right queries, regulate the server load. The magical LINQ mechanism fascinates me with its possibilities. Years later I can not imagine any faster and more comfortable ways to work with the database. But like almost any ORM, EF has several disadvantages. First it’s performance, but that’s a topic for a separate article. And secondly it is the coverage of the database itself.

MSSQL has a built-in full-text search that works "out of the box". You can use built-in predicates (CONTAINS and FREETEXT) or functions (CONTAINSTABLE and FREETEXTABLE) to perform full-text queries. There is only one problem: EF does not support full-text queries, not at all!

Here is an example from real experience. Let’s say I have an article table – Article, and I create a class for it which describes this table :

/// c#public partial class Article{public intId { get; set; }public System.DateTime Date { get; set; }public string Text { get; set; }public bool Active { get; set; }}

Then I need to do a sampling of these articles, say, display the last 10 published articles :

/// c#dbEntities db = new dbEntities();var articles = db.Article.Where(n => n.Active).OrderByDescending(n => n.Date).Take(10).ToArray();

Everything is very nice until the task of adding full-text search comes up. Since there is no support for full-text fetching functions in EF (in .NET core 2.1 already partially there is ) then you either have to use some third-party library or write the query in pure SQL.

The SQL query from the example above is not that complicated :

SELECT TOP (10)[Extent1].[Id] AS [Id], [Extent1].[Date] AS [Date], [Extent1].[Text] AS [Text], [Extent1].[Active] AS [Active]FROM [dbo].[Article] AS [Extent1]WHERE [Extent1].[Active] = 1ORDER BY [Extent1].[Date] DESC

In real projects, things are not that simple. Database queries are much more complicated and it’s hard and time consuming to manually maintain them. As a result, the first time I wrote a LINQ query, and then I prepared a generated text of SQL query to the database and used it to implement full-text data fetching conditions. Then I sent it to db.Database.SqlQuery and got the data I needed. Of course, all this is good while you don’t need to hang a dozen of different filters with complex join-names and conditions on the query.

So – I have a specific pain. We need to solve it!

In search of a solution

Once again sitting in my favorite search in the hope of finding some solution I stumbled upon this repository With this solution you can implement in LINQ support for predicates (CONTAINS and FREETEXT). Due to the support of EF 6 a special interface IDbCommandInterceptor that allows you to intercept the ready-made SQL query before sending it to the database and this solution was implemented. In the field Contains a special generated token string is substituted in the field, and then after request generation this field is replaced with the predicate Example :

/// c#var text = FullTextSearchModelUtil.Contains("code");db.Tables.Where(c=> c.Fullname.Contains(text));

However, if the data sample needs to be sorted by rank of matches, this solution will not work and you will have to write the SQL query manually. In fact, this solution substitutes the usual LIKE for predicate sampling.

So, at this point I have a question: is it possible to implement a real full-text search using MS SQL built-in functions (CONTAINSTABLE and FREETEXTABLE), to generate it all via LINQ and with sorting support for matching ranks? As it turns out, you can!

Implementation

First, we had to develop the logic for writing the query itself using LINQ. Since real SQL queries with full-text selections most often use JOIN to join a virtual table with ranks, I decided to follow the same path in a LINQ query.

Here is an example of such a LINQ query :

/// c#var queryFts = db.FTS_Int.Where(n => n.Query.Contains(queryText));var query = db.Article.Join(queryFts, article => article.Id, fts => fts.Key, (article, fts) => new{article.Id, article.Text, fts.Key, fts.Rank, }).OrderByDescending(n => n.Rank);

This code could not be compiled yet, but it already solved the problem of sorting the resulting data by rank visually. It remained to implement it in practice.

Additional class FTS_Int used in this query :

/// c#public partial class FTS_Int{public int Key { get; set; }public int Rank { get; set; }public string Query { get; set; }}

The name was not chosen by chance, because the key column in this class must match the key column in the lookup table (in my example with [Article].[Id] type int ). In case you need to query other tables with other types of key columns, I just assumed to copy such a class and create its Key of the type you need.

The condition itself to form a full-text query was supposed to be passed in a variable queryText To form the text of this variable, a separate function was implemented :

/// c#string queryText = FtsSearch.Query(dbContext: db, // the current database context, needed to form correct table namesftsEnum: FtsEnum.CONTAINS, // query type : CONTAINS or FREETEXTtableQuery: typeof(News), // type of the table a query is made fortableFts: typeof(FTS_Int), // Auxiliary class typesearch: "text");// condition of full-text search

Executing a Ready Request and Obtaining Data :

/// c#var result = FtsSearch.Execute(() => query.ToList());

Last function FtsSearch.Execute wrapper is used to temporarily connect the interface IDbCommandInterceptor In the above example, the author has preferred to use the query-swapping algorithm all the time for all the queries. As a result, every query has to look for the right combination after the query-swap mechanism is enabled. I found this option wasteful, so the execution of the data query itself is done in the passed function, which connects the query’s auto-replacement before the call and disables it after the call.

Application

I use autogenerated data model classes from the database using the edmx file. Since the simply generated class FTS_Int. cannot be used in EF due to the lack of necessary metadata in DbContext I created a real table based on his model (if anyone knows a better way, I will be glad to receive your help in comments):

Screenshot of the table created in the edmx file

Entity Framework 6 with Full-Text Search via LINQ

CREATE TABLE [dbo].[FTS_Int] ([Key] INT NOT NULL, [Rank] INT NOT NULL, [Query] NVARCHAR (1) NOT NULL, CONSTRAINT [PK_FTS_Int] PRIMARY KEY CLUSTERED ([Key] ASC));

After that, when we update the edmx file from the database, we add the created table and get its generated class :

/// c#public partial class FTS_Int{public int Key { get; set; }public int Rank { get; set; }public string Query { get; set; }}

This table will not be queried, it is only needed to correctly form the metadata to create a query. Final example of using full-text queries against a database :

/// c#string queryText = FtsSearch.Query(dbContext: db, ftsEnum: FtsEnum.CONTAINS, tableQuery: typeof(Article), tableFts: typeof(FTS_Int), search: "text");var queryFts = db.FTS_Int.Where(n => n.Query.Contains(queryText));var query = db.Article.Where(n => n.Active).Join(queryFts, article => article.Id, fts => fts.Key, (article, fts) => new{article, fts.Rank, }).OrderByDescending(n => n.Rank).Take(10).Select(n => n.article);var result = FtsSearch.Execute(() => query.ToList());

There is also support for asynchronous requests :

/// c#var result = await FtsSearch.ExecuteAsync(async () => await query.ToListAsync());

SQL query generated before auto-replacement :

SELECT TOP (10)[Project1].[Id] AS [Id], [Project1].[Date] AS [Date], [Project1].[Text] AS [Text], [Project1].[Active] AS [Active]FROM ( SELECT[Extent1].[Id] AS [Id], [Extent1].[Date] AS [Date], [Extent1].[Text] AS [Text], [Extent1].[Active] AS [Active], [Extent2].[Rank] AS [Rank]FROM [dbo].[Article] AS [Extent1]INNER JOIN [dbo].[FTS_Int] AS [Extent2] ON [Extent1].[Id] = [Extent2].[Key]WHERE ([Extent1].[Active] = 1) AND ([Extent2].[Query] LIKE @p__linq__0 ESCAPE N'~')) AS [Project1]ORDER BY [Project1].[Rank] DESC

SQL query generated after automatic exchange :

SELECT TOP (10)[Project1].[Id] AS [Id], [Project1].[Date] AS [Date], [Project1].[Text] AS [Text], [Project1].[Active] AS [Active]FROM ( SELECT[Extent1].[Id] AS [Id], [Extent1].[Date] AS [Date], [Extent1].[Text] AS [Text], [Extent1].[Active] AS [Active], [Extent2].[Rank] AS [Rank]FROM [dbo].[Article] AS [Extent1]INNER JOIN CONTAINSTABLE([dbo].[Article], (*), 'text')AS [Extent2] ON [Extent1].[Id] = [Extent2].[Key]WHERE ([Extent1].[Active] = 1) AND (1=1)) AS [Project1]ORDER BY [Project1].[Rank] DESC

By default, the full-text search works for all columns in the table :

CONTAINSTABLE([dbo].[Article], (*), 'text')

If you want to select only some fields, you can specify them in the fields parameter of the function FtsSearch.Query

Total

Result – full-text search support in LINQ.

The nuances of this approach.

  1. Parameter search in the function FtsSearch.Query does not use any checks or wrappers to protect against SQL injection. The value of this variable is passed as is in the query text.If you have any ideas about it, write in comments. I used regular expression which just removes all non alphanumeric characters.

  2. You should also take into account the peculiarities of constructing expressions for full-text queries. Parameter in function

    /* Query with error */CONTAINSTABLE([dbo].[News], (*), 'New Research')

    has an invalid format because MS SQL requires separating words with logical literals. To execute the query successfully you need to fix it this way :

    /* Correct query */CONTAINSTABLE([dbo].[News], (*), 'New and research')

    or change the data sampling function

    /* Correct queryFREETEXTTABLE([dbo].[News], (*), 'New Research')

    For more information about the specifics of creating queries, it is best to refer to official documentation

  3. Standard logging does not work correctly with this solution. A special logger was added for this purpose :

    /// c#db.Database.Log = (val) => Console.WriteLine(val);

    If you look at the generated query to the database, it will be generated before processing by the auto-replacement functions.

During testing, I also tested on more complex queries with multiple selections from different tables and there were no problems here.

Sources at GitHub

You may also like