Advanced Query Types#

In this notebook, we will explore advanced query types available in RedisVL:

  1. TextQuery: Full text search with advanced scoring

  2. AggregateHybridQuery and HybridQuery: Combine text and vector search for hybrid retrieval

  3. MultiVectorQuery: Search over multiple vector fields simultaneously

These query types are powerful tools for building sophisticated search applications that go beyond simple vector similarity search.

Prerequisites:

  • Ensure redisvl is installed in your Python environment.

  • Have a running instance of Redis Stack or Redis Cloud.

  • For HybridQuery, we will need Redis >= 8.4.0 and redis-py >= 7.1.0.

Setup and Data Preparation#

First, let’s create a schema and prepare sample data that includes text fields, numeric fields, and vector fields.

import numpy as np
from jupyterutils import result_print

# Sample data with text descriptions, categories, and vectors
data = [
    {
        'product_id': 'prod_1',
        'brief_description': 'comfortable running shoes for athletes',
        'full_description': 'Engineered with a dual-layer EVA foam midsole and FlexWeave breathable mesh upper, these running shoes deliver responsive cushioning for long-distance runs. The anatomical footbed adapts to your stride while the carbon rubber outsole provides superior traction on varied terrain.',
        'category': 'footwear',
        'price': 89.99,
        'rating': 4.5,
        'text_embedding': np.array([0.1, 0.2, 0.1], dtype=np.float32).tobytes(),
        'image_embedding': np.array([0.8, 0.1], dtype=np.float32).tobytes(),
    },
    {
        'product_id': 'prod_2',
        'brief_description': 'lightweight running jacket with water resistance',
        'full_description': 'Stay protected with this ultralight 2.5-layer DWR-coated shell featuring laser-cut ventilation zones and reflective piping for low-light visibility. Packs into its own chest pocket and weighs just 4.2 oz, making it ideal for unpredictable weather conditions.',
        'category': 'outerwear',
        'price': 129.99,
        'rating': 4.8,
        'text_embedding': np.array([0.2, 0.3, 0.2], dtype=np.float32).tobytes(),
        'image_embedding': np.array([0.7, 0.2], dtype=np.float32).tobytes(),
    },
    {
        'product_id': 'prod_3',
        'brief_description': 'professional tennis racket for competitive players',
        'full_description': 'Competition-grade racket featuring a 98 sq in head size, 16x19 string pattern, and aerospace-grade graphite frame that delivers explosive power with pinpoint control. Tournament-approved specs include 315g weight and 68 RA stiffness rating for advanced baseline play.',
        'category': 'equipment',
        'price': 199.99,
        'rating': 4.9,
        'text_embedding': np.array([0.9, 0.1, 0.05], dtype=np.float32).tobytes(),
        'image_embedding': np.array([0.1, 0.9], dtype=np.float32).tobytes(),
    },
    {
        'product_id': 'prod_4',
        'brief_description': 'yoga mat with extra cushioning for comfort',
        'full_description': 'Premium 8mm thick TPE yoga mat with dual-texture surface - smooth side for hot yoga flow and textured side for maximum grip during balancing poses. Closed-cell technology prevents moisture absorption while alignment markers guide proper positioning in asanas.',
        'category': 'accessories',
        'price': 39.99,
        'rating': 4.3,
        'text_embedding': np.array([0.15, 0.25, 0.15], dtype=np.float32).tobytes(),
        'image_embedding': np.array([0.5, 0.5], dtype=np.float32).tobytes(),
    },
    {
        'product_id': 'prod_5',
        'brief_description': 'basketball shoes with excellent ankle support',
        'full_description': 'High-top basketball sneakers with Zoom Air units in forefoot and heel, reinforced lateral sidewalls for explosive cuts, and herringbone traction pattern optimized for hardwood courts. The internal bootie construction and extended ankle collar provide lockdown support during aggressive drives.',
        'category': 'footwear',
        'price': 139.99,
        'rating': 4.7,
        'text_embedding': np.array([0.12, 0.18, 0.12], dtype=np.float32).tobytes(),
        'image_embedding': np.array([0.75, 0.15], dtype=np.float32).tobytes(),
    },
    {
        'product_id': 'prod_6',
        'brief_description': 'swimming goggles with anti-fog coating',
        'full_description': 'Low-profile competition goggles with curved polycarbonate lenses offering 180-degree peripheral vision and UV protection. Hydrophobic anti-fog coating lasts 10x longer than standard treatments, while the split silicone strap and interchangeable nose bridges ensure a watertight, custom fit.',
        'category': 'accessories',
        'price': 24.99,
        'rating': 4.4,
        'text_embedding': np.array([0.3, 0.1, 0.2], dtype=np.float32).tobytes(),
        'image_embedding': np.array([0.2, 0.8], dtype=np.float32).tobytes(),
    },
]
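
The embeddings above are stored as raw float32 bytes, so each vector occupies 4 bytes per dimension. The following is a minimal sketch for checking that a stored value decodes back to the expected number of dimensions (3 for text_embedding and 2 for image_embedding in the schema defined in the next section). The helper name decode_embedding is hypothetical and is not part of RedisVL.

```python
import numpy as np


def decode_embedding(raw: bytes, dims: int) -> np.ndarray:
    """Hypothetical helper: decode float32 bytes and verify the dimension count."""
    vec = np.frombuffer(raw, dtype=np.float32)
    if vec.shape[0] != dims:
        raise ValueError(f"expected {dims} dims, got {vec.shape[0]}")
    return vec


# Same encoding as used for prod_1 in the sample data
raw_text = np.array([0.1, 0.2, 0.1], dtype=np.float32).tobytes()
raw_image = np.array([0.8, 0.1], dtype=np.float32).tobytes()

text_vec = decode_embedding(raw_text, dims=3)
image_vec = decode_embedding(raw_image, dims=2)

# 4 bytes per float32 value: 12 bytes for 3 dims, 8 bytes for 2 dims
print(len(raw_text), len(raw_image))
```

A mismatch between the byte length and the dims declared in the schema is a common cause of vectors being silently skipped at index time, so a check like this can be useful before loading data.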

Define the Schema#

Our schema includes:

  • Tag fields: product_id, category

  • Text fields: brief_description and full_description for full-text search

  • Numeric fields: price, rating

  • Vector fields: text_embedding (3 dimensions) and image_embedding (2 dimensions) for semantic search

schema = {
    "index": {
        "name": "advanced_queries",
        "prefix": "products",
        "storage_type": "hash",
    },
    "fields": [
        {"name": "product_id", "type": "tag"},
        {"name": "category", "type": "tag"},
        {"name": "brief_description", "type": "text"},
        {"name": "full_description", "type": "text"},
        {"name": "price", "type": "numeric"},
        {"name": "rating", "type": "numeric"},
        {
            "name": "text_embedding",
            "type": "vector",
            "attrs": {
                "dims": 3,
                "distance_metric": "cosine",
                "algorithm": "flat",
                "datatype": "float32"
            }
        },
        {
            "name": "image_embedding",
            "type": "vector",
            "attrs": {
                "dims": 2,
                "distance_metric": "cosine",
                "algorithm": "flat",
                "datatype": "float32"
            }
        }
    ],
}

Create Index and Load Data#

from redisvl.index import SearchIndex

# Create the search index
index = SearchIndex.from_dict(schema, redis_url="redis://localhost:6379")

# Create the index and load data
index.create(overwrite=True)
keys = index.load(data)

print(f"Loaded {len(keys)} products into the index")
Loaded 6 products into the index


Reciprocal Rank Fusion (RRF)#

In addition to a linear combination of scores, HybridQuery supports reciprocal rank fusion (RRF). RRF combines results by their rank in each result list rather than by their raw scores, which gives more weight to the top results from each query.

HybridQuery allows for the following parameters to be specified for RRF:

  • rrf_window: The window size to use for the RRF combination method. Limits the fusion scope.

  • rrf_constant: The constant to use for the RRF combination method. Controls the decay of rank influence.

AggregateHybridQuery does not support RRF, and only supports a linear combination of scores.
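
To make the roles of the two parameters concrete, here is a minimal, standalone sketch of the general RRF technique. It is not the Redis implementation. The function name, the document IDs, and the treatment of window as a simple top-N cutoff per list are all assumptions made for illustration. Each document receives 1 / (constant + rank) from every list in which it appears, and the contributions are summed.

```python
def reciprocal_rank_fusion(ranked_lists, constant=60, window=None):
    """Illustrative RRF: fuse several ranked lists of document IDs.

    constant: larger values flatten the difference between ranks.
    window:   assumed here to mean "only the top `window` entries of each
              list contribute"; None means no limit.
    """
    scores = {}
    for ranking in ranked_lists:
        top = ranking if window is None else ranking[:window]
        for rank, doc_id in enumerate(top, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (constant + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical rankings from a text search and a vector search
text_ranking = ["a", "b", "c"]
vector_ranking = ["c", "a", "d"]

fused = reciprocal_rank_fusion([text_ranking, vector_ranking])
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

With a constant of 60, a document ranked first in one list and third in the other scores 1/61 + 1/63, roughly 0.0322665. The first hybrid_score in the output of the example that follows has this value, which suggests (but does not confirm) that 60 is the constant in effect there.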

if HYBRID_SEARCH_AVAILABLE:
    rrf_query = HybridQuery(
        text="comfortable",
        text_field_name="brief_description",
        vector=[0.15, 0.25, 0.15],
        vector_field_name="text_embedding",
        combination_method="RRF",
        return_fields=["product_id", "brief_description"],
        num_results=3,
        yield_text_score_as="text_score",
        yield_vsim_score_as="vector_similarity",
        yield_combined_score_as="hybrid_score",
    )

    results = index.query(rrf_query)
    result_print(results)

else:
    print("Hybrid search is not available in this version of Redis/redis-py.")
text_score | product_id | brief_description | vector_similarity | hybrid_score
1.63406294896 | prod_4 | yoga mat with extra cushioning for comfort | 1.0000000596 | 0.032266458496
1.63406294896 | prod_4 | yoga mat with extra cushioning for comfort | 1.0000000596 | 0.0317540322581
3.37571305608 | prod_1 | comfortable running shoes for athletes | 0.998058259487 | 0.0313188157573

Hybrid Query with Filters#

You can also combine hybrid search with filters:

if HYBRID_SEARCH_AVAILABLE:
    # Hybrid search with a price filter
    filtered_hybrid_query = HybridQuery(
        text="professional equipment",
        text_field_name="brief_description",
        vector=[0.9, 0.1, 0.05],
        vector_field_name="text_embedding",
        filter_expression=Num("price") > 100,
        return_fields=["product_id", "brief_description", "category", "price"],
        num_results=5,
        combination_method="LINEAR",
        yield_text_score_as="text_score",
        yield_vsim_score_as="vector_similarity",
        yield_combined_score_as="hybrid_score",
    )

    results = index.query(filtered_hybrid_query)
    result_print(results)

else:
    print("Hybrid search is not available in this version of Redis/redis-py.")
text_score | product_id | brief_description | category | price | vector_similarity | hybrid_score
3.30321812336 | prod_3 | professional tennis racket for competitive players | equipment | 199.99 | 1.0000000596 | 1.69096547873
3.30321812336 | prod_3 | professional tennis racket for competitive players | equipment | 199.99 | 1.0000000596 | 1.69096547873
0 | prod_2 | lightweight running jacket with water resistance | outerwear | 129.99 | 0.794171273708 | 0.555919891596
0 | prod_5 | basketball shoes with excellent ankle support | footwear | 139.99 | 0.794171273708 | 0.555919891596
0 | prod_2 | lightweight running jacket with water resistance | outerwear | 129.99 | 0.794171273708 | 0.555919891596
The same filter expression works with AggregateHybridQuery:

# Aggregate hybrid search with a price filter
filtered_hybrid_query = AggregateHybridQuery(
    text="professional equipment",
    text_field_name="brief_description",
    vector=[0.9, 0.1, 0.05],
    vector_field_name="text_embedding",
    filter_expression=Num("price") > 100,
    return_fields=["product_id", "brief_description", "category", "price"],
    num_results=5
)

results = index.query(filtered_hybrid_query)
result_print(results)
vector_distance | product_id | brief_description | category | price | vector_similarity | text_score | hybrid_score
-1.19209289551e-07 | prod_3 | professional tennis racket for competitive players | equipment | 199.99 | 1.0000000596 | 3.30321812336 | 1.69096547873
-1.19209289551e-07 | prod_3 | professional tennis racket for competitive players | equipment | 199.99 | 1.0000000596 | 3.30321812336 | 1.69096547873
0.411657452583 | prod_2 | lightweight running jacket with water resistance | outerwear | 129.99 | 0.794171273708 | 0 | 0.555919891596
0.411657452583 | prod_5 | basketball shoes with excellent ankle support | footwear | 139.99 | 0.794171273708 | 0 | 0.555919891596
0.411657452583 | prod_2 | lightweight running jacket with water resistance | outerwear | 129.99 | 0.794171273708 | 0 | 0.555919891596
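
The numbers in the AggregateHybridQuery output above can be reproduced by hand, which helps when debugging unexpected rankings. The sketch below recomputes the prod_2 row. It rests on two assumptions inferred from the displayed values rather than taken from the library source: that vector_similarity is (2 - vector_distance) / 2, and that the linear combination weights the text score by 0.3 and the vector similarity by 0.7.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), computed in float32 like the index."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


query_vector = [0.9, 0.1, 0.05]   # vector passed to the filtered hybrid query
prod_2_vector = [0.2, 0.3, 0.2]   # text_embedding of prod_2 in the sample data

distance = cosine_distance(query_vector, prod_2_vector)

# Assumption: similarity is the cosine distance rescaled from [0, 2] to [1, 0]
similarity = (2 - distance) / 2

# Assumption: weights inferred from the output, not read from the library
text_weight, vector_weight = 0.3, 0.7
text_score = 0.0  # prod_2 matches neither "professional" nor "equipment"

hybrid_score = text_weight * text_score + vector_weight * similarity
print(distance, similarity, hybrid_score)
```

Because prod_5's text_embedding is an exact scalar multiple of prod_2's, the two products have the same cosine distance to any query vector, which is why their rows carry identical vector scores.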

Using Different Text Scorers#

Hybrid queries support the same text scoring algorithms as TextQuery:

if HYBRID_SEARCH_AVAILABLE:
    # Hybrid query with TFIDF scorer
    hybrid_tfidf = HybridQuery(
        text="shoes support",
        text_field_name="brief_description",
        vector=[0.12, 0.18, 0.12],
        vector_field_name="text_embedding",
        text_scorer="TFIDF",
        return_fields=["product_id", "brief_description"],
        num_results=3,
        combination_method="LINEAR",
        yield_text_score_as="text_score",
        yield_vsim_score_as="vector_similarity",
        yield_combined_score_as="hybrid_score",
    )

    results = index.query(hybrid_tfidf)
    result_print(results)

else:
    print("Hybrid search is not available in this version of Redis/redis-py.")
text_score | product_id | brief_description | vector_similarity | hybrid_score
2.66666666667 | prod_1 | comfortable running shoes for athletes | 0.995073735714 | 1.496551615
2.66666666667 | prod_1 | comfortable running shoes for athletes | 0.995073735714 | 1.496551615
1.33333333333 | prod_5 | basketball shoes with excellent ankle support | 1 | 1.1
AggregateHybridQuery accepts the same text_scorer parameter:

# Aggregate Hybrid query with TFIDF scorer
hybrid_tfidf = AggregateHybridQuery(
    text="shoes support",
    text_field_name="brief_description",
    vector=[0.12, 0.18, 0.12],
    vector_field_name="text_embedding",
    text_scorer="TFIDF",
    return_fields=["product_id", "brief_description"],
    num_results=3
)

results = index.query(hybrid_tfidf)
result_print(results)
vector_distance | product_id | brief_description | vector_similarity | text_score | hybrid_score
0 | prod_5 | basketball shoes with excellent ankle support | 1 | 4 | 1.9
0 | prod_2 | lightweight running jacket with water resistance | 1 | 0 | 0.7
0 | prod_2 | lightweight running jacket with water resistance | 1 | 0 | 0.7

Runtime Parameters for Vector Search Tuning#

Important: AggregateHybridQuery uses FT.AGGREGATE commands which do NOT support runtime parameters.

Runtime parameters (such as ef_runtime for HNSW indexes or search_window_size for SVS-VAMANA indexes) are only supported with FT.SEARCH (and partially FT.HYBRID) commands.

For runtime parameter support, use HybridQuery, VectorQuery, or VectorRangeQuery instead:

  • HybridQuery: Supports ef_runtime for HNSW indexes

  • VectorQuery: Supports all runtime parameters (HNSW and SVS-VAMANA)

  • VectorRangeQuery: Supports all runtime parameters (HNSW and SVS-VAMANA)

  • AggregateHybridQuery: Does NOT support runtime parameters (uses FT.AGGREGATE)

See the Runtime Parameters section earlier in this notebook for examples of using runtime parameters with VectorQuery.

Comparing Query Types#

Let’s compare the three query types side by side:

# TextQuery - keyword-based search
text_q = TextQuery(
    text="shoes",
    text_field_name="brief_description",
    return_fields=["product_id", "brief_description"],
    num_results=3
)

print("TextQuery Results (keyword-based):")
result_print(index.query(text_q))
print()
TextQuery Results (keyword-based):
score | product_id | brief_description
2.9647332596813154 | prod_1 | comfortable running shoes for athletes
2.9647332596813154 | prod_1 | comfortable running shoes for athletes
2.148612199701887 | prod_5 | basketball shoes with excellent ankle support

if HYBRID_SEARCH_AVAILABLE:
    # HybridQuery - combines text and vector search
    hybrid_q = HybridQuery(
        text="shoes",
        text_field_name="brief_description",
        vector=[0.1, 0.2, 0.1],
        vector_field_name="text_embedding",
        return_fields=["product_id", "brief_description"],
        num_results=3,
        combination_method="LINEAR",
        yield_text_score_as="text_score",
        yield_vsim_score_as="vector_similarity",
        yield_combined_score_as="hybrid_score",
    )

    results = index.query(hybrid_q)

else:
    hybrid_q = AggregateHybridQuery(
        text="shoes",
        text_field_name="brief_description",
        vector=[0.1, 0.2, 0.1],
        vector_field_name="text_embedding",
        return_fields=["product_id", "brief_description"],
        num_results=3,
    )

    results = index.query(hybrid_q)


print(f"{hybrid_q.__class__.__name__} Results (text + vector):")
result_print(results)
print()
HybridQuery Results (text + vector):
text_score | product_id | brief_description | vector_similarity | hybrid_score
2.96473325968 | prod_1 | comfortable running shoes for athletes | 0.999999970198 | 1.58941995704
2.96473325968 | prod_1 | comfortable running shoes for athletes | 0.999999970198 | 1.58941995704
2.1486121997 | prod_5 | basketball shoes with excellent ankle support | 0.995073735714 | 1.34113527491

# MultiVectorQuery - searches multiple vector fields
mv_text = Vector(
    vector=[0.1, 0.2, 0.1],
    field_name="text_embedding",
    dtype="float32",
    weight=0.5
)

mv_image = Vector(
    vector=[0.8, 0.1],
    field_name="image_embedding",
    dtype="float32",
    weight=0.5
)

multi_q = MultiVectorQuery(
    vectors=[mv_text, mv_image],
    return_fields=["product_id", "brief_description"],
    num_results=3
)

print("MultiVectorQuery Results (multiple vectors):")
result_print(index.query(multi_q))
MultiVectorQuery Results (multiple vectors):
distance_0 | distance_1 | product_id | brief_description | score_0 | score_1 | combined_score
5.96046447754e-08 | 5.96046447754e-08 | prod_1 | comfortable running shoes for athletes | 0.999999970198 | 0.999999970198 | 0.999999970198
5.96046447754e-08 | 5.96046447754e-08 | prod_1 | comfortable running shoes for athletes | 0.999999970198 | 0.999999970198 | 0.999999970198
0.00985252857208 | 0.00266629457474 | prod_5 | basketball shoes with excellent ankle support | 0.995073735714 | 0.998666852713 | 0.996870294213
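
The combined_score column in the MultiVectorQuery output can be checked by hand. The sketch below recomputes the prod_5 row, assuming (from the displayed values, not from the library source) that each score_i is (2 - distance_i) / 2 and that combined_score is the weighted sum of those scores using the weight given to each Vector. The function name combined_score is hypothetical.

```python
def combined_score(distances, weights):
    """Illustrative weighted sum of per-field similarity scores."""
    # Assumption: cosine distance in [0, 2] is rescaled to a similarity in [0, 1]
    scores = [(2 - d) / 2 for d in distances]
    return sum(w * s for w, s in zip(weights, scores))


# distance_0 and distance_1 for prod_5, copied from the output above
prod_5_distances = [0.00985252857208, 0.00266629457474]
weights = [0.5, 0.5]  # the weights passed to mv_text and mv_image

prod_5_combined = combined_score(prod_5_distances, weights)
print(prod_5_combined)
```

Changing the weights shifts the balance between fields: with weights of 0.9 and 0.1, for example, the text embedding would dominate the ranking.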

Best Practices#

When to Use Each Query Type:#

  1. TextQuery:

    • When you need precise keyword matching

    • For traditional search engine functionality

    • When text relevance scoring is important

    • Example: Product search, document retrieval

  2. HybridQuery:

    • When you want to combine keyword and semantic search

    • For improved search quality over pure text or vector search

    • When you have both text and vector representations of your data

    • Example: E-commerce search, content recommendation

  3. MultiVectorQuery:

    • When you have multiple types of embeddings (text, image, audio, etc.)

    • For multi-modal search applications

    • When you want to balance multiple semantic signals

    • Example: Image-text search, cross-modal retrieval

# Cleanup
index.delete()