
Using graph neural networks to recommend related products

Recommending related products — say, a phone case to go along with a new phone — is a fundamental capability of e-commerce sites, one that saves customers time and leads to more satisfying shopping experiences.

At this year’s European Conference on Machine Learning (ECML), my colleagues and I presented a new way to recommend related products, which uses graph neural networks on directed graphs.

In experiments, we found that our approach outperformed state-of-the-art baselines by 30% to 160%, as measured by hit rate and mean reciprocal rank, both of which compare model predictions to actual customer co-purchases. We have begun to deploy this model in production.

Diagram of the CPR model architecture.


The main difficulty with using graph neural networks (GNNs) to do related-product recommendation is that the relationships between products are asymmetric. It makes perfect sense to recommend a phone case to someone who’s buying a new phone but less sense to recommend a phone to someone who’s buying a case.

A graph can capture that type of asymmetry with a directed edge, which indicates that the relationship between two graph nodes flows in only one direction. But directedness is hard for GNN embeddings — that is, the vector representations produced by GNNs — to capture.

We solve this problem by producing two embeddings of every graph node: one that characterizes its role as the source of a related-product recommendation and one that characterizes its role as the target. We also present a new loss function that encourages related-product recommendation (RPR) models to select products along outbound graph edges and discourages them from recommending products along inbound edges.

A new approach to using graph neural networks for related-product recommendation produces two embeddings of every graph node: one that characterizes its role as the source of a recommendation and one that characterizes its role as the target.

Because our GNN takes product metadata as an input — as well as the graph structure — it also helps address the problem of cold start, or how to account for products that have only recently been introduced to the catalogue. Finally, we introduce a data augmentation method that helps overcome the problem of selection bias, which arises from disparities in the way information is presented.

Graph building

In our product graph, the nodes represent products, and the node data consists of product metadata — product name, product type, product description, and so on. To add directed edges to the graph, we use co-purchase data, or data on which products tend to be purchased together. These edges may be unidirectional, as when, say, one product is an accessory of another, or bidirectional, when the products are co-purchased but neither depends on the other.

In this simplified graph, orange edges (which may be unidirectional or bidirectional) represent product co-purchases, and red edges (which are always bidirectional) represent similarity.

This approach, however, runs the risk of introducing selection bias into the model. In this context, selection bias occurs when customers’ preferential selection of one product reflects greater exposure to that product. To offset that risk, our graph also includes bidirectional edges that we derive from co-view data, or data on which products tend to be viewed together under the same product query. Essentially, the co-view data helps us identify products that are similar to each other.

The product graph thus has two types of edges: edges indicating co-purchases and edges indicating similarity.
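To make the construction concrete, here is a minimal sketch of how such a graph might be assembled with networkx, assuming co-purchase and co-view pairs are available as lists of product-ID tuples. The product IDs, metadata fields, and edge attributes below are illustrative, not drawn from our production pipeline.

```python
import networkx as nx

# Illustrative product metadata; in practice this comes from the catalogue.
products = {
    "phone":  {"name": "Smartphone X", "type": "phone"},
    "case":   {"name": "Case for Smartphone X", "type": "accessory"},
    "case_2": {"name": "Another compatible case", "type": "accessory"},
}

# Directed co-purchase signal: (source, target) means "customers who bought
# `source` also bought `target`", so a recommendation flows source -> target.
co_purchases = [("phone", "case")]

# Co-view pairs are treated as symmetric similarity relationships.
co_views = [("case", "case_2")]

g = nx.MultiDiGraph()
for pid, meta in products.items():
    g.add_node(pid, **meta)

for src, dst in co_purchases:
    # Unidirectional unless the reverse pair also appears in the data,
    # in which case the relationship effectively becomes bidirectional.
    g.add_edge(src, dst, kind="co_purchase")

for a, b in co_views:
    # Similarity edges are always added in both directions.
    g.add_edge(a, b, kind="similar")
    g.add_edge(b, a, kind="similar")
```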

GNN embeddings

For each node in the product graph, the GNN produces an embedding, which captures information about the node’s immediate vicinity. We use two-hop embeddings, meaning they factor in information about both a node’s immediate neighbors and those nodes’ neighbors.
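For a directed graph like the sketch above, the set of nodes that can influence a given node's two-hop embedding can be gathered with a simple breadth-first expansion. The helper below is purely illustrative and is not part of the model itself.

```python
def two_hop_neighborhood(g, node):
    """Return the node itself, its neighbors, and its neighbors' neighbors,
    following edges in both directions of a networkx directed graph."""
    one_hop = set(g.successors(node)) | set(g.predecessors(node))
    two_hop = set()
    for n in one_hop:
        two_hop |= set(g.successors(n)) | set(g.predecessors(n))
    return {node} | one_hop | two_hop
```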


The key to our model is the procedure for generating separate source and target embeddings. For each node, the source embedding factors in all the node’s similarity relationships but only its outbound co-purchase relationships. Conversely, the target embedding factors in all the node’s similarity relationships but only its inbound co-purchase relationships.

The GNN is multilayered, and each layer takes in the node representations produced by the layer below and outputs new node representations. At the first layer, the representations are simply the product metadata, so the source and target embeddings are the same. Beginning at the second layer, however, the source and target embeddings diverge.

Thereafter, the source embedding for each node factors in the target embeddings of the nodes with which it has outbound co-purchase relationships and the source embeddings of the nodes with which it has similarity relationships. The target embedding for each node factors in the source embeddings of the nodes with which it has inbound co-purchase relationships and the target embeddings of the similar nodes.

The dual embeddings (right) corresponding to the sample product graph (left). The suffix “-s” indicates a source embedding, the suffix “-t” a target embedding.
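One way to express this update is sketched below in PyTorch, using simple mean aggregation and illustrative parameter names; the paper's actual aggregation functions and weighting may differ.

```python
import torch
import torch.nn as nn


def _mean_or_self(a, b, fallback):
    """Mean of the stacked neighbor messages; fall back to the node's own
    vector when it has no neighbors of either kind."""
    msgs = torch.cat([a, b], dim=0)
    return msgs.mean(dim=0) if msgs.size(0) > 0 else fallback


class DualEmbeddingLayer(nn.Module):
    """One GNN layer that maintains separate source and target representations
    per node, sketched here with simple mean aggregation."""

    def __init__(self, dim):
        super().__init__()
        self.w_src = nn.Linear(2 * dim, dim)  # updates source embeddings
        self.w_tgt = nn.Linear(2 * dim, dim)  # updates target embeddings

    def forward(self, h_src, h_tgt, out_nbrs, in_nbrs, sim_nbrs):
        # h_src, h_tgt: [num_nodes, dim] source/target embeddings from the layer below.
        # out_nbrs[i] / in_nbrs[i]: outbound / inbound co-purchase neighbors of node i.
        # sim_nbrs[i]: similarity neighbors of node i.
        new_src, new_tgt = [], []
        for i in range(h_src.size(0)):
            # Source side: target embeddings of outbound co-purchase neighbors
            # plus source embeddings of similar nodes.
            src_msg = _mean_or_self(h_tgt[out_nbrs[i]], h_src[sim_nbrs[i]], h_src[i])
            new_src.append(torch.relu(self.w_src(torch.cat([h_src[i], src_msg]))))
            # Target side: source embeddings of inbound co-purchase neighbors
            # plus target embeddings of similar nodes.
            tgt_msg = _mean_or_self(h_src[in_nbrs[i]], h_tgt[sim_nbrs[i]], h_tgt[i])
            new_tgt.append(torch.relu(self.w_tgt(torch.cat([h_tgt[i], tgt_msg]))))
        return torch.stack(new_src), torch.stack(new_tgt)
```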

We train the GNN in a self-supervised way using contrastive learning, which pulls together the embeddings of a given node and the nodes that share edges with it, while pushing apart the embeddings of the given node and a randomly selected, unconnected node. An additional term of the loss function enforces the asymmetry between the source and target embeddings, promoting the incorporation of information about target nodes connected by outbound edges and penalizing the incorporation of information about target nodes connected by inbound edges.
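A rough sketch of such a loss is shown below, written in PyTorch with dot-product scores, a margin, and a weighted asymmetry penalty. The margin, the weighting, and the exact form of the asymmetry term are assumptions made for illustration rather than the formulation used in the paper.

```python
import torch
import torch.nn.functional as F


def contrastive_rpr_loss(src_u, tgt_v, tgt_neg, tgt_u, src_v,
                         margin=1.0, asym_weight=0.1):
    """Sketch of a contrastive loss for a directed co-purchase edge u -> v.

    src_u: source embedding of u      tgt_v: target embedding of v
    tgt_neg: target embedding of a randomly sampled, unconnected node
    tgt_u, src_v: embeddings used to penalize the reverse direction v -> u
    """
    pos = (src_u * tgt_v).sum(-1)      # pull u's source toward v's target
    neg = (src_u * tgt_neg).sum(-1)    # push u's source away from a random node
    contrastive = F.relu(margin - pos + neg)

    # Asymmetry term: the reverse-direction score (recommending u from v)
    # should stay below the forward-direction score.
    reverse = (src_v * tgt_u).sum(-1)
    asymmetry = F.relu(reverse - pos)

    return (contrastive + asym_weight * asymmetry).mean()
```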

Once the GNN is trained, selecting the k best related products to recommend is simply a matter of identifying the k nodes closest to the source node in the embedding space. In experiments, we compared our approach to its two best-performing predecessors, using hit rate and mean reciprocal rank for the top 5, 10, and 20 recommendations, on two different datasets, for 12 experiments in all. We found that our method outperformed the benchmarks across the board — often by a large margin. You can find more details in our paper.
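In its simplest form, that lookup is a brute-force nearest-neighbor search between a query product's source embedding and every candidate's target embedding, as in the sketch below. A production system would typically use an approximate-nearest-neighbor index, and the dot-product score here is an illustrative choice of similarity measure.

```python
import torch


def top_k_related(query_src_emb, all_tgt_embs, k=10):
    """Return the indices of the k candidate products whose target embeddings
    score highest against the query product's source embedding."""
    scores = all_tgt_embs @ query_src_emb   # [num_products] dot-product scores
    return torch.topk(scores, k).indices
```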


