Search Results for

    Show / Hide Table of Contents

    Full Text Search

    PostgreSQL has built-in support for full-text search, which allows you to conveniently and efficiently query natural language documents.

    Mapping

    PostgreSQL full text search types are mapped onto .NET types built-in to Npgsql. The tsvector type is mapped to NpgsqlTsVector and tsquery is mapped to NpgsqlTsQuery. This means you can use properties of type NpgsqlTsVector directly in your model to create tsvector columns. The NpgsqlTsQuery type on the other hand, is used in LINQ queries.

    public class Product
    {
        public int Id { get; set; }
        public string Title { get; set; }
        public string Description { get; set; }
        public NpgsqlTsVector SearchVector { get; set; }
    }
    

    Setting up and querying a full text search index on an entity

    As the PostgreSQL documentation explains, full-text search requires an index to run efficiently. This section will show two ways to do this, each having its benefits and drawbacks. Please read the PostgreSQL docs for more information on the two different approaches.

    Method 1: tsvector column

    This method adds a tsvector column to your table, that is automatically updated when the row is modified. First, add an NpgsqlTsVector property to your entity:

    public class Product
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Description { get; set; }
        public NpgsqlTsVector SearchVector { get; set; }
    }
    

    Setting up the column to be auto-updated depends on your PostgreSQL version. On PostgreSQL 12 and above, the column can be a simple generated column, and version 5.0.0 contains sugar for setting that up. In previous versions, you must manually set up database triggers that update the column instead.

    • PostgreSQL 12+
    • Older Versions
    Note

    The below only works on PostgreSQL 12 and version 5.0.0 of the EF Core provider.

    The following will set up a generated tsvector column, over which you can easily create an index:

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Product>()
            .HasGeneratedTsVectorColumn(
                p => p.SearchVector,
                "english",  // Text search config
                p => new { p.Name, p.Description })  // Included properties
            .HasIndex(p => p.SearchVector)
            .HasMethod("GIN"); // Index method on the search vector (GIN or GIST)
    }
    

    First, modify the OnModelCreating() of your context class to add an index as follows:

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Product>()
            .HasIndex(p => p.SearchVector)
            .HasMethod("GIN"); // Index method on the search vector (GIN or GIST)
    }
    

    Now generate a migration (dotnet ef migrations add ....), and open it with your favorite editor, adding the following:

    public partial class CreateProductTable : Migration
    {
        protected override void Up(MigrationBuilder migrationBuilder)
        {
            // Migrations for creation of the column and the index will appear here, all we need to do is set up the trigger to update the column:
    
            migrationBuilder.Sql(
                @"CREATE TRIGGER product_search_vector_update BEFORE INSERT OR UPDATE
                  ON ""Products"" FOR EACH ROW EXECUTE PROCEDURE
                  tsvector_update_trigger(""SearchVector"", 'pg_catalog.english', ""Name"", ""Description"");");
    
            // If you were adding a tsvector to an existing table, you should populate the column using an UPDATE
            // migrationBuilder.Sql("UPDATE \"Products\" SET \"Name\" = \"Name\";");
        }
    
        protected override void Down(MigrationBuilder migrationBuilder)
        {
            // Migrations for dropping of the column and the index will appear here, all we need to do is drop the trigger:
            migrationBuilder.Sql("DROP TRIGGER product_search_vector_update");
        }
    }
    

    Once your auto-updated tsvector column is set up, any inserts or updates on the Products table will now update the SearchVector column and maintain it automatically. You can query it as follows:

    var context = new ProductDbContext();
    var npgsql = context.Products
        .Where(p => p.SearchVector.Matches("Npgsql"))
        .ToList();
    

    Method 2: Expression index

    Version 5.0.0 of the provider includes sugar for defining the appropriate expression index; if you're using an older version, you'll have to define a raw SQL migration yourself.

    • Version 5.0.0
    • Older Versions
    modelBuilder.Entity<Blog>()
        .HasIndex(b => new { b.Title, b.Description })
        .HasMethod("GIN")
        .IsTsVectorExpressionIndex("english");
    

    Create a migration which will contain the index creation SQL (dotnet ef migrations add ...). At this point, open the generated migration with your editor and add the following:

    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.Sql(@"CREATE INDEX fts_idx ON ""Product"" USING GIN (to_tsvector('english', ""Name"" || ' ' || ""Description""));");
    }
    
    protected override void Down(MigrationBuilder migrationBuilder)
        migrationBuilder.Sql(@"DROP INDEX fts_idx;");
    }
    

    Once the index is created on the Title and Description columns, you can query as follows:

    var context = new ProductDbContext();
    var npgsql = context.Products
        .Where(p => EF.Functions.ToTsVector("english", p.Title + " " + p.Description)
            .Matches("Npgsql"))
        .ToList();
    

    Computed column over JSON columns

    Starting with 7.0, the provider can also create computed tsvector columns over JSON columns. Simply use HasGeneratedTsVectorColumn() as shown above, and when applied to JSON columns, the provider will automatically generate json_to_tsvector/jsonb_to_tsvector as appropriate.

    Note that this will pass the filter all to these functions, meaning that all values in the JSON document will be included. To customize the filter - or to create the computed column on older versions of the provider - simply specify the function yourself via HasComputedColumnSql.

    Operation translation

    Almost all PostgreSQL full text search functions can be called through LINQ queries. All supported EF Core LINQ methods are defined in extension classes in the Microsoft.EntityFrameworkCore namespace, so simply referencing the Npgsql provider will light up these methods. The following table lists all supported operations; if an operation you need is missing, please open an issue to request for it.

    .NET SQL
    EF.Functions.ToTsVector(string) to_tsvector(string)
    EF.Functions.ToTsVector("english", string) to_tsvector('english'::regconfig, string)
    EF.Functions.ToTsQuery(string)) to_tsquery(string)
    EF.Functions.ToTsQuery("english", string ) to_tsquery('english'::regconfig, string)
    EF.Functions.PlainToTsQuery(string) plainto_tsquery(string)
    EF.Functions.PlainToTsQuery("english", string) plainto_tsquery('english'::regconfig, string)
    EF.Functions.PhraseToTsQuery(string) phraseto_tsquery(string)
    EF.Functions.PhraseToTsQuery("english", string) phraseto_tsquery('english'::regconfig, string)
    EF.Functions.WebSearchToTsQuery(string) websearch_to_tsquery(string)
    EF.Functions.WebSearchToTsQuery("english", string) websearch_to_tsquery('english'::regconfig, string)
    EF.functions.ArrayToTsVector(new[] { "a", "b" }) array_to_tsvector(ARRAY['a', 'b'])
    NpgsqlTsVector.Parse(string) CAST(string AS tsvector)
    NpgsqlTsQuery.Parse(string) CAST(queryString AS tsquery)
    tsvector.Matches(string) tsvector @@ plainto_tsquery(string)
    tsvector.Matches(tsquery) tsvector @@ tsquery
    tsquery1.And(tsquery2) tsquery1 && tsquery2
    tsquery1.Or(tsquery2) tsquery1 || tsquery2
    tsquery.ToNegative() !! tsquery
    tsquery1.Contains(tsquery2) tsquery1 @> tsquery2
    tsquery1.IscontainedIn(tsquery2) tsquery1 <@ tsquery2
    tsquery.GetNodeCount() numnode(query)
    tsquery.GetQueryTree() querytree(query)
    tsquery.GetResultHeadline("a b c") ts_headline('a b c', query)
    tsquery.GetResultHeadline("a b c", "MinWords=1, MaxWords=2") ts_headline('a b c', query, 'MinWords=1, MaxWords=2')
    tsquery.Rewrite(targetQuery, substituteQuery) ts_rewrite(to_tsquery(tsquery), to_tsquery(targetQuery), to_tsquery(substituteQuery))
    tsquery1.ToPhrase(tsquery2) tsquery_phrase(tsquery1, tsquery2)
    tsquery1.ToPhrase(tsquery2, distance) tsquery_phrase(tsquery1, tsquery2, distance)
    tsvector1.Concat(tsvector2) tsvector1 || tsvector2
    tsvector.Delete("x") ts_delete(tsvector, 'x')
    tsvector.Delete(new[] { "x", "y" }) ts_delete(tsvector, ARRAY['x', 'y'])
    tsvector.Filter(new[] { "x", "y" }) ts_filter(tsvector, ARRAY['x', 'y'])
    tsvector.GetLength() length(tsvector)
    tsvector.Rank(tsquery) ts_rank(tsvector, tsquery)
    tsvector.RankCoverDensity(tsquery) ts_rank_cd(tsvector, tsquery)
    tsvector.SetWeight(NpgsqlTsVector.Lexeme.Weight.A) setweight(tsvector, 'A')
    tsvector.ToStripped() strip(tsvector)
    EF.Functions.Unaccent(string) unaccent(string)
    EF.Functions.Unaccent(regdictionary, string) unaccent(regdictionary, string)
    • Improve this Doc
    In This Article
    Back to top © Copyright 2022 The Npgsql Development Team