An Awesome Tool To Quickly Create An Amazon-Like Recommendation Engine

by Michael, October 16th, 2020

[TL;DR: Get started with Dgraph's Slash GraphQL product and connect it to a Spring Boot application that acts as a simple RESTful recommendation service.]

Back in the early 2000s, I was working on a project implementing an eCommerce solution by Art Technology Group (ATG), now owned by Oracle.  The ATG Dynamo product was an impressive solution as it included a persistence layer and a scenarios module.  At the time, companies like Target and Best Buy used the Dynamo solution, leveraging the scenario module to provide recommendations to the customer.

As an example, the Dynamo eCommerce solution was smart enough to remember when a customer added a product to their cart and later removed it.  As an incentive, the scenario server could be designed to offer the same customer a modest discount on a future visit if they re-added the product into their cart and purchased it within the next 24 hours.

Since those days, I have always wanted to create a simple recommendations engine, which is my goal for this publication.

About the Recommendations Example

I wanted to keep things simple and create some basic domain objects for the recommendations engine.  In this example, the solution will make recommendations for musical artists and the underlying Artist object is quite simple:

@AllArgsConstructor
@NoArgsConstructor
@Data
public class Artist {
   private String name;
}

In a real system, there would be so many more attributes to track. However, in this example, the name of the artist will suffice.

As one might expect, customers will rate artists on a scale of 1 to 5, where a value of five represents the best score possible.  Of course, it is possible (and expected) that customers will not rate every artist.  The customer will be represented (again) by a very simple Customer object:

@AllArgsConstructor
@NoArgsConstructor
@Data
public class Customer {
   private String username;
}

The concept of a customer rating an artist will be captured in the following Rating object:

@AllArgsConstructor
@NoArgsConstructor
@Data
public class Rating {
   private String id;
   private double score;
   private Customer by;
   private Artist about;
}

In my normal Java programming efforts, I would likely name these fields private Customer customer and private Artist artist, but I wanted to follow the pattern employed by graph databases, which favors names like `by` and `about` that describe the relationship.  This should become clearer as the article continues.

Dgraph Slash GraphQL

Given the popularity of graph databases, I felt that my exploration of a recommendations engine should employ one.  After all, GraphQL has become a popular language for querying services that expose graph-shaped data.  While I have only limited experience with graph databases, my analysis concluded that a graph database is the right choice for this project, and graph databases often sit behind real-world services that make recommendations.

Graph databases are a great solution when the relationships (edges) between your data (nodes) are just as important as the data itself—and a recommendation engine is the perfect example.  

However, since I'm just starting out with graph databases, I certainly didn't want to worry about starting up a container or running a GraphQL database locally.  Instead I wanted to locate a SaaS provider.  I decided to go with Dgraph's fully-managed backend service, called Slash GraphQL.  It's a hosted, native GraphQL solution. The Slash GraphQL service was just released on September 10th, 2020 and can be enabled by using the following link. The platform offers a free trial which will work for this article (then moves to a $9.99/mo flat fee for up to 5GB data).

https://slash.dgraph.io/

After launching this URL, a new account can be created using the normal authorization services:

In my example, I created a backend called "spring-boot-demo" which ultimately resulted in the following dashboard:

The process to get started was quick and free, making it effortless to configure the Slash GraphQL service.

Configuring Slash GraphQL

As with any database solution, we must write a schema and deploy it to the database.  With Slash GraphQL, this was quick and easy:

type Artist {
   name: String! @id @search(by: [hash, regexp])
   ratings: [Rating] @hasInverse(field: about)
}
type Customer {
   username: String! @id @search(by: [hash, regexp])
   ratings: [Rating] @hasInverse(field: by)
}
type Rating {
   id: ID!
   about: Artist!
   by: Customer!
   score: Int @search
}

In fact, my original design was far more complex than it needed to be, and making revisions took far less effort than I expected.  As a developer, I quickly began to see the value of being able to alter the schema so easily.

With the schema in place, I was able to quickly populate some basic Artist information:

mutation {
 addArtist(input: [
   {name: "Eric Johnson"},
   {name: "Genesis"},
   {name: "Led Zeppelin"},
   {name: "Rush"},
   {name: "Triumph"},
   {name: "Van Halen"},
   {name: "Yes"}]) {
   artist {
     name
   }
 }
}

At the same time, I added a few fictional Customer records:

mutation {
 addCustomer(input: [
   {username: "Doug"},
   {username: "Jeff"},
   {username: "John"},
   {username: "Russell"},
   {username: "Stacy"}]) {
   customer {
     username
   }
 }
}

As a result, the five customers will provide ratings for the seven artists, using the following table:

An example of the rating process is shown below:

mutation {
 addRating(input: [{
   by: {username: "Jeff"},
   about: { name: "Triumph"},
   score: 4}])
 {
   rating {
     score
     by { username }
     about { name }
   }
 }
}

With the Slash GraphQL data configured and running, I can now switch gears and work on the Spring Boot service.

The Slope One Ratings Algorithm

In 2005, a research paper by Daniel Lemire and Anna Maclachlan introduced the Slope One family of collaborative filtering algorithms.  This simple form of item-based collaborative filtering looked like a perfect fit for a recommendations service, because it uses the ratings given by other customers to score the items a given customer has not yet rated.

In pseudo-code, the Recommendations Service would achieve the following objectives:

  • retrieve the ratings available for all artists (by all customers)
  • create a Map<Customer, Map<Artist, Double>> from the data, which is a customer map, containing all the artists and their ratings
  • the rating score of 1 to 5 will be converted to a simple range between 0.2 (worst rating of 1) and 1.0 (best rating of 5).
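The steps above can be sketched in plain Java. This is only an illustration under stated assumptions: the CustomerMapBuilder class and its RawRating stand-in are hypothetical names that do not come from the article's repository, and plain strings are used as map keys instead of the Customer and Artist objects shown earlier. The sample scores in main() are the ones that appear elsewhere in this article (Russell's three ratings and Jeff's rating of Triumph).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the article's repository): turns raw 1-5
// ratings into the nested customer -> artist -> normalized score map.
class CustomerMapBuilder {

    // Minimal stand-in for the article's Rating object, using plain strings
    // for the customer username and the artist name.
    static final class RawRating {
        final String username;
        final String artist;
        final int score; // 1 (worst) to 5 (best)

        RawRating(String username, String artist, int score) {
            this.username = username;
            this.artist = artist;
            this.score = score;
        }
    }

    // Maps a 1-5 score onto the 0.2-1.0 range by dividing by the maximum score.
    static double normalize(int score) {
        return score / 5.0;
    }

    // Groups the flat list of ratings by customer, then by artist.
    static Map<String, Map<String, Double>> build(List<RawRating> ratings) {
        Map<String, Map<String, Double>> customerMap = new HashMap<>();
        for (RawRating r : ratings) {
            customerMap
                .computeIfAbsent(r.username, k -> new HashMap<>())
                .put(r.artist, normalize(r.score));
        }
        return customerMap;
    }

    public static void main(String[] args) {
        List<RawRating> ratings = List.of(
            new RawRating("Russell", "Genesis", 2),
            new RawRating("Russell", "Led Zeppelin", 5),
            new RawRating("Russell", "Rush", 3),
            new RawRating("Jeff", "Triumph", 4));
        System.out.println(build(ratings));
    }
}
```

Dividing by the maximum score keeps the mapping trivially reversible: multiplying a normalized value by 5 returns it to the five-point scale.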

With the customer map created, the core of the Slope One ratings processing will execute by calling the SlopeOne class:

  • populate a Map<Artist, Map<Artist, Double>> used to track the average difference in ratings between each pair of artists
  • populate a Map<Artist, Map<Artist, Integer>> used to track how many customers rated both artists in each pair
  • use the existing maps to create a Map<Customer, HashMap<Artist, Double>> which contains projected ratings for the items a given customer has not rated
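To make these three maps concrete, here is a compact sketch of the weighted Slope One scheme. It is not the SlopeOne class from the article's repository, which is not reproduced here. The class name SlopeOneSketch, the string keys, and the toy ratings in main() are all assumptions chosen for illustration, and the ratings are not taken from the article's table.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative weighted Slope One; names and data are hypothetical.
class SlopeOneSketch {

    // Input:  customer -> (artist -> normalized rating).
    // Output: customer -> (unrated artist -> projected rating).
    static Map<String, Map<String, Double>> predict(Map<String, Map<String, Double>> data) {
        Map<String, Map<String, Double>> diff = new HashMap<>();  // summed rating differences per artist pair
        Map<String, Map<String, Integer>> freq = new HashMap<>(); // customers who rated both artists in a pair

        // Step 1: for every customer, record the difference between each pair of artists they rated.
        for (Map<String, Double> ratings : data.values()) {
            for (Map.Entry<String, Double> a : ratings.entrySet()) {
                for (Map.Entry<String, Double> b : ratings.entrySet()) {
                    if (a.getKey().equals(b.getKey())) continue;
                    diff.computeIfAbsent(a.getKey(), k -> new HashMap<>())
                        .merge(b.getKey(), a.getValue() - b.getValue(), Double::sum);
                    freq.computeIfAbsent(a.getKey(), k -> new HashMap<>())
                        .merge(b.getKey(), 1, Integer::sum);
                }
            }
        }

        // Step 2: project a rating for each artist the customer has not rated,
        // weighting each known rating by the number of co-ratings behind it.
        Map<String, Map<String, Double>> projected = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> customer : data.entrySet()) {
            Map<String, Double> own = customer.getValue();
            Map<String, Double> predictions = new HashMap<>();
            for (String target : diff.keySet()) {
                if (own.containsKey(target)) continue;
                double weightedSum = 0.0;
                int totalCount = 0;
                for (Map.Entry<String, Double> rated : own.entrySet()) {
                    Integer count = freq.get(target).get(rated.getKey());
                    if (count == null) continue; // nobody rated both artists
                    double avgDiff = diff.get(target).get(rated.getKey()) / count;
                    weightedSum += (rated.getValue() + avgDiff) * count;
                    totalCount += count;
                }
                if (totalCount > 0) predictions.put(target, weightedSum / totalCount);
            }
            projected.put(customer.getKey(), predictions);
        }
        return projected;
    }

    public static void main(String[] args) {
        // Toy data for illustration only.
        Map<String, Map<String, Double>> data = new HashMap<>();
        data.put("alice", Map.of("A", 1.0, "B", 0.6, "C", 0.4));
        data.put("bob", Map.of("A", 0.6, "B", 0.8));
        data.put("carol", Map.of("B", 0.4, "C", 1.0));
        System.out.println(predict(data));
    }
}
```

In this sketch the difference map stores running sums and divides by the frequency only when a prediction is needed; storing averages directly, as the bullet list describes, is equivalent.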

For this example, a random Customer is selected and the corresponding object from the Map<Customer, HashMap<Artist, Double>> projectedData map is analyzed to return the following results:

{
   "matchedCustomer": {
       "username": "Russell"
   },
   "recommendationsMap": {
       "Artist(name=Eric Johnson)": 0.7765842245950264,
       "Artist(name=Yes)": 0.7661904474477843,
       "Artist(name=Triumph)": 0.7518039724158979,
       "Artist(name=Van Halen)": 0.7635436007978691
   },
   "ratingsMap": {
       "Artist(name=Genesis)": 0.4,
       "Artist(name=Led Zeppelin)": 1.0,
       "Artist(name=Rush)": 0.6
   },
   "resultsMap": {
       "Artist(name=Eric Johnson)": 0.7765842245950264,
       "Artist(name=Genesis)": 0.4,
       "Artist(name=Yes)": 0.7661904474477843,
       "Artist(name=Led Zeppelin)": 1.0,
       "Artist(name=Rush)": 0.6,
       "Artist(name=Triumph)": 0.7518039724158979,
       "Artist(name=Van Halen)": 0.7635436007978691
   }
}

In the example above, the "Russell" user was randomly selected.  When looking at the original table (above), Russell only provided ratings for Genesis, Led Zeppelin, and Rush.  The only artist that he truly admired was Led Zeppelin.  This information is included in the ratingsMap object and also in the resultsMap object.

The resultsMap object includes projected ratings for the other four artists: Eric Johnson, Yes, Triumph, and Van Halen.  To make things easier, there is a recommendationsMap included in the payload, which includes only the artists that were not rated by Russell.

Based upon the other reviews, the Recommendations Service would slightly favor Eric Johnson over the other three unrated artists, with a score of 0.78, which is nearly a value of four in the five-point rating system.
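The relationship between the three maps in the payload can be sketched as follows. The RecommendationFilter class and its method names are hypothetical and are not part of the article's service; the scores in main() are copied from the JSON response shown above, with plain artist names used as keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper showing how recommendationsMap relates to the other two maps.
class RecommendationFilter {

    // Keeps only the entries of resultsMap whose artist is absent from ratingsMap.
    static Map<String, Double> unratedOnly(Map<String, Double> results, Map<String, Double> ratings) {
        Map<String, Double> recommendations = new LinkedHashMap<>();
        results.forEach((artist, score) -> {
            if (!ratings.containsKey(artist)) {
                recommendations.put(artist, score);
            }
        });
        return recommendations;
    }

    // Returns the artist with the highest projected score, if any.
    static Optional<String> topPick(Map<String, Double> recommendations) {
        return recommendations.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey);
    }

    // Converts a normalized score back to the five-point scale.
    static double toFivePointScale(double normalized) {
        return normalized * 5.0;
    }

    public static void main(String[] args) {
        Map<String, Double> ratings = Map.of(
            "Genesis", 0.4, "Led Zeppelin", 1.0, "Rush", 0.6);
        Map<String, Double> results = Map.of(
            "Eric Johnson", 0.7765842245950264,
            "Genesis", 0.4,
            "Yes", 0.7661904474477843,
            "Led Zeppelin", 1.0,
            "Rush", 0.6,
            "Triumph", 0.7518039724158979,
            "Van Halen", 0.7635436007978691);

        Map<String, Double> recommendations = unratedOnly(results, ratings);
        String best = topPick(recommendations).orElse("none");
        System.out.println(best + " -> " + toFivePointScale(recommendations.getOrDefault(best, 0.0)));
    }
}
```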

The Recommendations Service

In order to use the Recommendations Service, the Spring Boot server simply needs to be running and configured to connect to the Slash GraphQL cloud-based instance.  The GraphQL Endpoint shown on the Slash GraphQL Dashboard can be specified in the application.yml as slash-graph-ql.hostname or passed in via the ${SLASH_GRAPH_QL_HOSTNAME} environment variable.

The basic recommendations engine can be called using the following RESTful URI:

GET - {spring-boot-service-host-name}/recommend

This action is configured by the RecommendationsController, as shown below:

@GetMapping(value = "/recommend")
public ResponseEntity<Recommendation> recommend() {
   try {
       return new ResponseEntity<>(recommendationService.recommend(), HttpStatus.OK);
   } catch (Exception e) {
       return new ResponseEntity<>(HttpStatus.BAD_REQUEST);
   }
}

Which calls the RecommendationService:

@Slf4j
@RequiredArgsConstructor
@Service
public class RecommendationService {
   private final ArtistService artistService;
   private final CustomerService customerService;
   private final SlashGraphQlProperties slashGraphQlProperties;
   private static final String RATING_QUERY = "query RatingQuery { queryRating { id, score, by { username }, about { name } } }";
   public Recommendation recommend() throws Exception {
       ResponseEntity<String> responseEntity = RestTemplateUtils.query(slashGraphQlProperties.getHostname(), RATING_QUERY);
       try {
           ObjectMapper objectMapper = new ObjectMapper();
           SlashGraphQlResultRating slashGraphQlResult = objectMapper.readValue(responseEntity.getBody(), SlashGraphQlResultRating.class);
           log.debug("slashGraphQlResult={}", slashGraphQlResult);
           return makeRecommendation(slashGraphQlResult.getData());
       } catch (JsonProcessingException e) {
           throw new Exception("An error was encountered processing responseEntity=" + responseEntity.getBody(), e);
       }
   }
...
}

Note that something easily missed at a quick glance of this code is the power and ease of pulling out a subgraph to drive the recommendation.  In the example above, the slashGraphQlResult.getData() call provides that subgraph to the makeRecommendation() method.

The RATING_QUERY is the expected Slash GraphQL format to retrieve Rating objects.  The RestTemplateUtils.query() method is part of a static utility class, to keep things DRY (don't repeat yourself):

public final class RestTemplateUtils {
   private RestTemplateUtils() { }
   private static final String MEDIA_TYPE_GRAPH_QL = "application/graphql";
   private static final String GRAPH_QL_URI = "/graphql";
   public static ResponseEntity<String> query(String hostname, String query) {
       RestTemplate restTemplate = new RestTemplate();
       HttpHeaders headers = new HttpHeaders();
       headers.setContentType(MediaType.valueOf(MEDIA_TYPE_GRAPH_QL));
       HttpEntity<String> httpEntity = new HttpEntity<>(query, headers);
       return restTemplate.exchange(hostname + GRAPH_QL_URI, HttpMethod.POST, httpEntity, String.class);
   }
}
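For readers who are not using Spring's RestTemplate, the same request can be sketched with the JDK's built-in java.net.http client (Java 11 or later). This is an alternative illustration rather than the article's code: the GraphQlHttpSketch class name is hypothetical and the hostname in main() is a placeholder. The content type and the /graphql path mirror the constants in RestTemplateUtils above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical JDK-only counterpart to RestTemplateUtils.
class GraphQlHttpSketch {
    private static final String MEDIA_TYPE_GRAPH_QL = "application/graphql";
    private static final String GRAPH_QL_URI = "/graphql";

    // Builds a POST request whose body is the raw GraphQL query text.
    static HttpRequest buildRequest(String hostname, String query) {
        return HttpRequest.newBuilder()
            .uri(URI.create(hostname + GRAPH_QL_URI))
            .header("Content-Type", MEDIA_TYPE_GRAPH_QL)
            .POST(HttpRequest.BodyPublishers.ofString(query))
            .build();
    }

    // Sends the request and returns the response body as a string.
    static String query(String hostname, String query) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
            client.send(buildRequest(hostname, query), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Placeholder hostname; the request is built and inspected but not sent.
        HttpRequest request = buildRequest(
            "https://example.invalid",
            "query RatingQuery { queryRating { id, score, by { username }, about { name } } }");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Separating buildRequest() from query() makes the request construction easy to inspect without any network access.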

Once the slashGraphQlResult object is retrieved, the private makeRecommendation() method is called, which returns the following Recommendation object (shown above in JSON format):

@AllArgsConstructor
@NoArgsConstructor
@Data
public class Recommendation {
   private Customer matchedCustomer;
   private HashMap<Artist, Double> recommendationsMap;
   private HashMap<Artist, Double> ratingsMap;
   private HashMap<Artist, Double> resultsMap;
}

Conclusion

In this article, an instance of Dgraph Slash GraphQL was created with a new schema, and sample data was loaded.  That data was then utilized by a Spring Boot service which served as a basic recommendations engine.  For those interested in the full source code, please review the GitLab repository.

From a cost perspective, I am quite impressed with the structure that Slash GraphQL provides.  The new account screen indicated that I have 10,000 credits to use, per month, for no charge.  In the entire time I used Slash GraphQL to prototype and create everything for this article, I only utilized 292 credits.  For those needing more than 10,000 credits, $45 (USD) per month allows for 100,000 credits and $99 (USD) per month allows for 250,000 credits.

Using a graph database for the first time did present a small learning curve, and I am certain there is far more that I can learn by continuing this exercise.  I felt that Slash GraphQL exceeded my expectations with the ability to change the schema as I learned more about my needs.  As a feature developer, this is a very important aspect that should be recognized, especially compared to the same scenario with a traditional database.

In my next article, I'll introduce an Angular (or React) client into this process, which will interface directly with GraphQL and the Recommendation Service running in Spring Boot.

(Published with the kind permission of the original author, John Vester)