vendredi 17 avril 2015

Social Networking Using OQGraph

I was given the chance to experiment typical social networking query on an existing 60 Millions edges dataset

How You're Connected


Such algorithms and others are simply hardcoded into the OQGraph. 

With the upgrade of OQGraph V3 into MariaDB 10 we can proceed directly on top of the exiting tables holding the edges kine of featured VIRTUAL VIEW. 



CREATE OR REPLACE TABLE `relations` (
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `id1` int(10) unsigned NOT NULL,
  `id2` int(10) unsigned NOT NULL,
  `relation_type` tinyint(3) unsigned DEFAULT NULL,
  KEY `id1` (`id1`),
  KEY `id2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

oqgraph=# select count(*) from relations;

+----------+
| count(*) |
+----------+
| 59479722 |
+----------+
1 row in set (23.05 sec)

Very nice integration of table discovery that save me referring to documentation to found out all columns definition.  

CREATE TABLE `oq_graph`
ENGINE=OQGRAPH `data_table`='relations' `origid`='id1' `destid`='id2';

oqgraph=# SELECT * FROM oq_graph WHERE latch='breadth_first' AND origid=175135 AND destid=7;
+---------------+--------+--------+--------+------+--------+
| latch         | origid | destid | weight | seq  | linkid |
+---------------+--------+--------+--------+------+--------+
| breadth_first | 175135 |      7 |   NULL |    0 | 175135 |
| breadth_first | 175135 |      7 |      1 |    1 |      7 |
+---------------+--------+--------+--------+------+--------+
2 rows in set (0.00 sec)


oqgraph=# SELECT * FROM oq_graph WHERE latch='breadth_first' AND origid=175135 AND destid=5615775;
+---------------+--------+---------+--------+------+----------+
| latch         | origid | destid  | weight | seq  | linkid   |
+---------------+--------+---------+--------+------+----------+
| breadth_first | 175135 | 5615775 |   NULL |    0 |   175135 |
| breadth_first | 175135 | 5615775 |      1 |    1 |        7 |
| breadth_first | 175135 | 5615775 |      1 |    2 | 13553091 |
| breadth_first | 175135 | 5615775 |      1 |    3 |  1440976 |
| breadth_first | 175135 | 5615775 |      1 |    4 |  5615775 |
+---------------+--------+---------+--------+------+----------+
5 rows in set (0.44 sec)

What we first highlight is that underlying table indexes KEY `id1` (`id1`), KEY `id2` (`id2`) are used by OQgrah to navigate the vertices via a number of key reads and range scans, such 5 level relation was around 2689 jump and 77526  range access to the table . 

Meaning the death of the graph was around 2500 with an average of 30 edges per vertex 

# MyISAM

oqgraph=# SELECT * FROM oq_graph_myisam WHERE latch='breadth_first' AND origid=175135 AND destid=5615775;
+---------------+--------+---------+--------+------+----------+
| latch         | origid | destid  | weight | seq  | linkid   |
+---------------+--------+---------+--------+------+----------+
| breadth_first | 175135 | 5615775 |   NULL |    0 |   175135 |
| breadth_first | 175135 | 5615775 |      1 |    1 |        7 |
| breadth_first | 175135 | 5615775 |      1 |    2 | 13553091 |
| breadth_first | 175135 | 5615775 |      1 |    3 |  1440976 |
| breadth_first | 175135 | 5615775 |      1 |    4 |  5615775 |
+---------------+--------+---------+--------+------+----------+
5 rows in set (0.11 sec)

Need to investigate more such speed difference using MyISAM. Ideas are welcome ?

Aucun commentaire:

Enregistrer un commentaire