Creating a recommendation system

Help
2010-02-25
2012-09-14
  • Nobody/Anonymous

    Hi, I'd like to create a recommendation system that uses transductive
    reasoning. Is that possible using Waffles? Also, where can I find good
    documentation about transductive recommendation systems?

     
  • Nobody/Anonymous

    Waffles can do transduction. It won't help with collecting data, interfacing
    with a database, building a web site, attracting customers, e-commerce, or any
    of those other things you typically find in such a system.

    Here's how I'd approach it: Start by collecting a lot of data. (The more the
    better. People are complex creatures. You can't expect to accurately predict
    them if you only have a couple thousand training samples.) Next, tinker with
    Waffles (or Weka, or any other machine learning package) to figure out which
    model and features give good predictions. Finally, integrate the best
    predictive model with your system.
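
    To make the "tinker" step a bit more concrete, here is a minimal Python
    sketch (not Waffles or Weka code) of comparing two candidate predictors on
    held-out data. The (user, item, rating) layout, the two baseline models,
    the toy data, and every function name are assumptions made up for
    illustration.

```python
import random

def split(ratings, test_fraction=0.25, seed=0):
    """Shuffle the ratings and split them into (train, test)."""
    shuffled = ratings[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def fit_global_mean(train):
    """Baseline: always predict the average rating seen in training."""
    mean = sum(r for _, _, r in train) / len(train)
    return lambda user, item: mean

def fit_item_mean(train):
    """Predict each item's average rating; fall back to the global mean."""
    fallback = sum(r for _, _, r in train) / len(train)
    sums, counts = {}, {}
    for _, item, r in train:
        sums[item] = sums.get(item, 0.0) + r
        counts[item] = counts.get(item, 0) + 1
    return lambda user, item: (
        sums[item] / counts[item] if item in counts else fallback
    )

def rmse(predict, test):
    """Root-mean-squared error of a predictor on held-out ratings."""
    return (sum((predict(u, i) - r) ** 2 for u, i, r in test) / len(test)) ** 0.5

def compare(ratings, candidates, seed=0):
    """Fit every candidate on the same training split and score it."""
    train, test = split(ratings, seed=seed)
    return {name: rmse(fit(train), test) for name, fit in candidates.items()}

# Made-up toy data: every user rates item "a" 5 stars and item "b" 1 star.
ratings = [(f"u{u}", item, stars)
           for u in range(20)
           for item, stars in (("a", 5.0), ("b", 1.0))]

scores = compare(ratings, {"global_mean": fit_global_mean,
                           "item_mean": fit_item_mean})
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE {score:.3f}")
```

    Real candidates would be actual learning algorithms and feature sets
    rather than these two baselines, but the loop stays the same: one shared
    split, one error measure, and the lowest error wins.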

    I have never looked into it, but I suspect that you will find a dearth of good
    documentation about which features are useful for making good predictions
    about people's purchasing preferences. Business-type people are likely to
    hoard such information because it makes them money, and academic-type people
    aren't likely to think such info is worth publishing.

     
  • Nobody/Anonymous

    ...if you already have loads of data, and you're trying to enter something
    akin to the Netflix competition, then that's a little bit different. The first
    problem you'll probably encounter is that there is more data than you can load
    into your computer's memory.

    If your data comes from unstructured text, or other sparse forms of data,
    Waffles enables you to encode your data as a sparse matrix and train without
    ever actually expanding the matrix. This allows you to use huge tables of data
    without filling up your memory. Unfortunately, only the Naive Bayes and Neural
    Net models have been tested with sparse-matrix training. If one of those two
    choices meets your needs, then you're all set.
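
    As a rough illustration of training on sparse data without expanding it,
    here is a small Python sketch of a multinomial naive Bayes learner. It is
    not the Waffles API: the dict-of-non-zero-entries row format, the Laplace
    smoothing, the toy data, and all function names are assumptions chosen
    for the example.

```python
import math

def train_sparse_nb(rows, labels, n_features, alpha=1.0):
    """Train multinomial naive Bayes on sparse rows ({column: count})."""
    class_counts, feat_counts, totals = {}, {}, {}
    for row, y in zip(rows, labels):
        class_counts[y] = class_counts.get(y, 0) + 1
        counts = feat_counts.setdefault(y, {})
        for col, val in row.items():  # only non-zero entries are visited
            counts[col] = counts.get(col, 0.0) + val
            totals[y] = totals.get(y, 0.0) + val
    model = {}
    for y, c in class_counts.items():
        denom = totals.get(y, 0.0) + alpha * n_features
        model[y] = {
            "log_prior": math.log(c / len(labels)),
            # Shared log-likelihood for every column never seen with class y.
            "log_unseen": math.log(alpha / denom),
            "log_seen": {col: math.log((v + alpha) / denom)
                         for col, v in feat_counts[y].items()},
        }
    return model

def predict_sparse_nb(model, row):
    """Return the class with the highest log-posterior for a sparse row."""
    best, best_score = None, -math.inf
    for y, m in model.items():
        score = m["log_prior"]
        for col, val in row.items():
            score += val * m["log_seen"].get(col, m["log_unseen"])
        if score > best_score:
            best, best_score = y, score
    return best

# Made-up toy data: a million columns, but only a few non-zeros per row.
N_FEATURES = 1_000_000
rows = [{3: 2, 999_999: 1}, {3: 1, 17: 1}, {42: 3, 500_000: 1}, {42: 1, 7: 2}]
labels = ["likes", "likes", "dislikes", "dislikes"]

model = train_sparse_nb(rows, labels, N_FEATURES)
print(predict_sparse_nb(model, {3: 1}))
print(predict_sparse_nb(model, {42: 2}))
```

    Memory use here grows with the number of non-zero entries, not with rows
    times columns, which is the point of staying sparse.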

    If your dataset is too big, it's usually a good idea to sub-sample, and then
    use PCA to reduce the dimensionality of your data to create a dataset of
    reasonable size. Then, you can try cross-validation with lots of different
    models and different parameters without having to wait for days for each
    experiment to complete. After you find one that works reasonably well on the
    smallish dataset, then try it on the large one.
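
    The sub-sample, reduce, then cross-validate loop described above could
    look roughly like the Python sketch below. It is only a sketch under
    stated assumptions: the PCA uses plain power iteration with deflation
    (a simple stand-in, not how Waffles implements it), the nearest-centroid
    classifier is just a placeholder model, and the synthetic data and all
    function names are made up.

```python
import random

def subsample(rows, labels, k, seed=0):
    """Pick k random rows (without replacement) and their labels."""
    idx = random.Random(seed).sample(range(len(rows)), k)
    return [rows[i] for i in idx], [labels[i] for i in idx]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pca(rows, n_components, iters=100, seed=0):
    """Return (mean, components) found by power iteration with deflation."""
    rng = random.Random(seed)
    d = len(rows[0])
    mean = [sum(col) / len(rows) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        start = dot(v, v) ** 0.5
        v = [a / start for a in v]
        for _ in range(iters):
            # w = X^T (X v): proportional to covariance times v,
            # computed without forming the covariance matrix.
            proj = [dot(x, v) for x in centered]
            w = [sum(p * x[j] for p, x in zip(proj, centered)) for j in range(d)]
            norm = dot(w, w) ** 0.5
            if norm == 0:
                break  # no variance left in the data
            v = [a / norm for a in w]
        components.append(v)
        # Deflate: remove the found direction so the next pass finds a new one.
        deflated = []
        for x in centered:
            p = dot(x, v)
            deflated.append([a - p * b for a, b in zip(x, v)])
        centered = deflated
    return mean, components

def project(rows, mean, components):
    """Map rows onto the principal components."""
    return [[dot([x - m for x, m in zip(r, mean)], c) for c in components]
            for r in rows]

def fit_centroids(rows, labels):
    """Placeholder model: one mean vector per class."""
    groups = {}
    for r, y in zip(rows, labels):
        groups.setdefault(y, []).append(r)
    return {y: [sum(col) / len(g) for col in zip(*g)] for y, g in groups.items()}

def predict_centroid(centroids, row):
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(row, centroids[y])))

def cross_validate(rows, labels, folds=5):
    """k-fold cross-validation accuracy of the placeholder model."""
    correct = 0
    for f in range(folds):
        pairs = list(enumerate(zip(rows, labels)))
        train = [p for i, p in pairs if i % folds != f]
        test = [p for i, p in pairs if i % folds == f]
        model = fit_centroids([r for r, _ in train], [y for _, y in train])
        correct += sum(predict_centroid(model, r) == y for r, y in test)
    return correct / len(rows)

def make_data(n=400, d=10, seed=1):
    """Synthetic two-class data: class 1 is shifted on the first 3 columns."""
    rng = random.Random(seed)
    rows = [[rng.gauss(5.0 * (i % 2) if j < 3 else 0.0, 1.0) for j in range(d)]
            for i in range(n)]
    return rows, [i % 2 for i in range(n)]

rows, labels = make_data()
small_rows, small_labels = subsample(rows, labels, 100)   # sub-sample
mean, comps = pca(small_rows, 2)                          # reduce 10 -> 2 dims
reduced = project(small_rows, mean, comps)
accuracy = cross_validate(reduced, small_labels)          # quick experiment
print(f"5-fold accuracy on the reduced subsample: {accuracy:.2f}")
```

    Once a model and parameter setting look good on the small, reduced data,
    the same mean and components can be applied to the full dataset with
    project() before the final, slower training run.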

     
  • Mike Gashler - 2010-11-05

    If you are still interested, I added a demo recommendation system to the
    latest release. You might find it to be helpful.

     
