Playing with Google’s Cloud Vision API
The age of intelligent APIs is here.
If you want a taste of what Google’s new “Cloud Vision API” can do, sign up here: https://cloud.google.com/vision/.
Once you’re accepted, get an API key from the Google console.
Then you have at your fingertips:
- Optical Character Recognition
- Face detection and sentiment analysis, e.g. happy, sad
- Landmark detection
I wrote a tiny little Ruby script if you want to give it a whirl. It specifically uses the landmark detection API.
require 'base64'
require 'json'
require 'faraday'

filepath = '/path/to/my/image.jpg'
# strict_encode64 avoids the line breaks that encode64 inserts into its output.
content = Base64.strict_encode64(File.binread(filepath))

conn = Faraday.new(url: 'https://vision.googleapis.com')
data = {
  requests: [{
    image: { content: content },
    features: [{ type: 'LANDMARK_DETECTION', maxResults: 10 }]
  }]
}

response = conn.post do |req|
  req.url '/v1alpha1/images:annotate?key=<your api key>'
  req.headers['Content-Type'] = 'application/json'
  req.body = data.to_json
end

# Print the JSON body rather than the Faraday response object itself.
puts response.body
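The script above hard-codes LANDMARK_DETECTION, but the request body has room for several features at once. Here is a rough sketch of a helper that builds the body for any list of feature types. The helper name build_annotate_body is made up for illustration, and I am assuming TEXT_DETECTION and FACE_DETECTION are the type names for the OCR and face features listed earlier, so check them against the docs before relying on this.

```ruby
require 'base64'
require 'json'

# Hypothetical helper (not part of any library): builds the JSON body for an
# images:annotate request covering one image and any number of feature types.
def build_annotate_body(image_bytes, feature_types, max_results: 10)
  features = feature_types.map do |type|
    { 'type' => type, 'maxResults' => max_results }
  end

  {
    'requests' => [
      {
        'image' => { 'content' => Base64.strict_encode64(image_bytes) },
        'features' => features
      }
    ]
  }.to_json
end

# Stand-in bytes so the sketch runs without a real image on disk.
# The feature type names other than LANDMARK_DETECTION are assumptions.
body = build_annotate_body('fake image bytes', %w[LANDMARK_DETECTION TEXT_DETECTION FACE_DETECTION])
puts body
```

With a real image you would pass File.binread(filepath) as the first argument and assign the result to req.body.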
Here is a picture I took in 2006, with no geolocation info in its EXIF data.
What does the API return?
It gives a confidence score, a description of the landmark, its rough latitude and longitude, and a bunch of other cool stuff I have yet to understand.
[
    [0] {
        "landmarkAnnotations" => [
            [0] {
                "mid" => "/m/01k_5m",
                "description" => "Lake Tahoe",
                "score" => 0.46974313,
                "boundingPoly" => {
                    "vertices" => [
                        [0] {
                            "x" => 1092,
                            "y" => 555
                        },
                        [1] {
                            "x" => 1273,
                            "y" => 555
                        },
                        [2] {
                            "x" => 1273,
                            "y" => 944
                        },
                        [3] {
                            "x" => 1092,
                            "y" => 944
                        }
                    ]
                },
                "locations" => [
                    [0] {
                        "latLng" => {
                            "latitude" => 38.940395,
                            "longitude" => -119.91884
                        }
                    }
                ]
            },
            [1] {
                "score" => 0.29812887,
                "boundingPoly" => {
                    "vertices" => [
                        [0] {
                            "x" => 960,
                            "y" => 902
                        },
                        [1] {
                            "x" => 1558,
                            "y" => 902
                        },
                        [2] {
                            "x" => 1558,
                            "y" => 1073
                        },
                        [3] {
                            "x" => 960,
                            "y" => 1073
                        }
                    ]
                },
                "locations" => [
                    [0] {
                        "latLng" => {
                            "latitude" => 38.943923,
                            "longitude" => -119.928989
                        }
                    }
                ]
            }
        ]
    }
]
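If you only care about the name, score and coordinates, something like the sketch below can flatten that structure. It assumes the data is already parsed into the array-of-hashes shape printed above; summarize_landmarks is a name I made up, and the sample is a trimmed copy of the output with the bounding polygons left out.

```ruby
# Hypothetical helper: flattens the parsed annotations into one small hash
# per landmark. Entries without a "description" get a placeholder name.
def summarize_landmarks(responses)
  responses.flat_map do |res|
    (res['landmarkAnnotations'] || []).map do |annotation|
      lat_lng = annotation.dig('locations', 0, 'latLng') || {}
      {
        name: annotation['description'] || '(unnamed)',
        score: annotation['score'],
        latitude: lat_lng['latitude'],
        longitude: lat_lng['longitude']
      }
    end
  end
end

# Trimmed copy of the output shown above (boundingPoly omitted).
sample = [
  {
    'landmarkAnnotations' => [
      {
        'mid' => '/m/01k_5m',
        'description' => 'Lake Tahoe',
        'score' => 0.46974313,
        'locations' => [{ 'latLng' => { 'latitude' => 38.940395, 'longitude' => -119.91884 } }]
      },
      {
        'score' => 0.29812887,
        'locations' => [{ 'latLng' => { 'latitude' => 38.943923, 'longitude' => -119.928989 } }]
      }
    ]
  }
]

summarize_landmarks(sample).each do |landmark|
  puts "#{landmark[:name]}: score #{landmark[:score]} at #{landmark[:latitude]}, #{landmark[:longitude]}"
end
```

Note that the second annotation has no description at all, which is why the helper falls back to a placeholder instead of assuming every hit is named.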