Archive for category Technology

Guest Blogger Brad Zobrist – How he would implement a Bitcasa

First, I think it’s good to clarify what I understand Bitcasa is trying to do, and how I would do it if I were their architect.

This of course is all guessing and speculation.

They are trying to store all client data forever, cache all recently accessed data locally, and predictively cache other data, all while expiring old, infrequently accessed data from the cache.

Features they have or will be implementing are:

  • All data is backed up to a cloud. All data is deduplicated with all other client data
  • All data is NOT accessible by any other client or them
  • Files and / or folders can be shared seamlessly with other users.
  • Local hard drive space is used as a cache by predictive algorithms

Here is how I would implement each one of these to create an overall architecture similar to what I think Bitcasa is providing.  At a high level I would deduplicate all data at the block level across ALL clients, have each client encrypt (with the client’s key) the references to the blocks that make up its files and folders, store those encrypted references in a per-user file database or key->value store, compress and encrypt (with Bitcasa’s master key) new, unknown blocks, then write those blocks back to the Bitcasa cloud storage.  Bitcasa would retain a master hash table of all known blocks. Each client would send a list of its block hashes, and the master block hash table would respond by telling the client to send only new, unknown blocks back to storage, compressed and encrypted.

So, breaking it down, here is how they address each feature:

  • All data is backed up to a cloud & All data is deduplicated with all other client data.

All unencrypted blocks (not files) are hashed, and those hashes are sent to the master hash table at Bitcasa. You then get back a list of which “new” blocks need to be backed up; duplicate blocks can be thrown away.
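As a rough illustration of that exchange, here is a minimal Python sketch. Everything in it is an assumption made for the example: the 4096-byte block size, SHA-256 as the hash, and the names `MasterHashTable` and `backup`. Compression and encryption of the uploaded blocks are left out, and none of this is Bitcasa's actual code.

```python
from __future__ import annotations

import hashlib

BLOCK_SIZE = 4096  # assumed block size; the post does not specify one


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_hash(block: bytes) -> str:
    """Hash one block; SHA-256 is an assumption, any strong hash would do."""
    return hashlib.sha256(block).hexdigest()


class MasterHashTable:
    """Hypothetical stand-in for the provider-side table of known blocks."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def unknown(self, hashes: list[str]) -> set[str]:
        """Return the subset of hashes the table has never seen."""
        return {h for h in hashes if h not in self.blocks}

    def store(self, digest: str, block: bytes) -> None:
        self.blocks[digest] = block


def backup(data: bytes, master: MasterHashTable) -> tuple[list[str], int]:
    """Send hashes first, then upload only the blocks the master asks for.

    Returns the ordered block hashes for `data` and the number of blocks
    that actually had to be uploaded.
    """
    blocks = split_blocks(data)
    hashes = [block_hash(b) for b in blocks]
    needed = master.unknown(hashes)
    uploaded = 0
    for digest, block in zip(hashes, blocks):
        if digest in needed:
            master.store(digest, block)
            needed.discard(digest)  # a repeated block is uploaded only once
            uploaded += 1
    return hashes, uploaded
```

Backing up a second file that shares a block with the first uploads only the blocks the table has not seen, which is the whole point of the scheme.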

  • All data is NOT accessible by any other client.

The next level is to take the blocks / hashes that make up a file and create a list, encrypted with the user’s client key, of which blocks make up that file. You may even need to take the block size down to the sector level to deal with files that could fit in a single block or segment.

  • Files and / or folders can be shared seamlessly with other users.

Because files / folders are only stored as an encrypted reference to the blocks that make them up, you simply need to give the new client the list of those blocks.
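To make the sharing idea concrete, here is a small Python sketch. It is only a guess at the shape of the thing: the block store and the per-user file databases are plain dictionaries, the tiny 4-byte block size is just for the demo, the names `put_blocks`, `rebuild` and `share_file` are made up, and the encryption of each user's references is omitted entirely.

```python
from __future__ import annotations

import hashlib


def put_blocks(store: dict[str, bytes], data: bytes, size: int = 4) -> list[str]:
    """Store data block by block and return its manifest (ordered block hashes)."""
    manifest = []
    for i in range(0, len(data), size):
        block = data[i:i + size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # already-known blocks are not stored again
        manifest.append(digest)
    return manifest


def rebuild(store: dict[str, bytes], manifest: list[str]) -> bytes:
    """Reassemble a file from the shared block store using its manifest."""
    return b"".join(store[digest] for digest in manifest)


def share_file(user_db: dict[str, dict[str, list[str]]],
               owner: str, recipient: str, path: str) -> None:
    """Share a file by copying its block list; no block data is copied.

    In the scheme described above the list would be decrypted with the
    owner's key and re-encrypted for the recipient; that step is omitted here.
    """
    user_db[recipient][path] = list(user_db[owner][path])
```

For example, after `share_file(user_db, "alice", "bob", "notes.txt")`, Bob can call `rebuild` with his copy of the manifest, and the block store itself has not grown at all.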

  • Local hard drive space is used as a cache by predictive algorithms

An additional piece of code monitors which files / hashes / blocks are being accessed and knows whether they’re cached locally or need to be pulled remotely.  I believe the predictive part is where most of their patents are, but unfortunately we won’t be able to find out about them for ~18 months; see “Bitcasa gets an early start on IP acquisition”.
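The non-predictive half of that cache is easy to sketch. The Python below is a plain least-recently-used cache that falls back to a remote fetch on a miss; it says nothing about how Bitcasa predicts what to pre-fetch, and the `BlockCache` name, the capacity counted in blocks, and the `fetch_remote` callback are all assumptions for illustration.

```python
from collections import OrderedDict
from typing import Callable


class BlockCache:
    """Least-recently-used local cache in front of a remote block store."""

    def __init__(self, capacity: int, fetch_remote: Callable[[str], bytes]) -> None:
        self.capacity = capacity          # max number of blocks kept locally
        self.fetch_remote = fetch_remote  # hypothetical call out to the cloud
        self.local = OrderedDict()        # block hash -> block, oldest first
        self.remote_fetches = 0

    def get(self, digest: str) -> bytes:
        if digest in self.local:
            # Cache hit: mark the block as most recently used.
            self.local.move_to_end(digest)
            return self.local[digest]
        # Cache miss: pull the block remotely and keep a local copy.
        block = self.fetch_remote(digest)
        self.remote_fetches += 1
        self.local[digest] = block
        if len(self.local) > self.capacity:
            # Expire the least recently used block.
            self.local.popitem(last=False)
        return block
```

A predictive layer would presumably sit on top of something like this and call `get` ahead of time for blocks it expects to be needed.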

Well, that’s what I think?

Thoughts?


2 Comments

Snowed in? 15 Things you should be using online CCOD – 9.6.2011

There are a ton of cool things to do on the Internet. New doors are open to everyone. I’m surprised how often we take it for granted that everyone is in on the latest trend in tech. Here is my humble addition to a list of things that I think people should be using online.

 

1. Twitter – News *stream*, or should I say FLOOD. Follow smart people, get smart (filtered) news and info. Want to blow your news mind? Get tweetdeck and put in a search for any hot topic. (Don’t follow #earthquake unless you want to feel constant fear).

2. Facebook – Connect with your family and friends. Be benign on Facebook! The Internet is public, immortal, and Facebook does hate your privacy.

3. Amazon AWS/EC2 - What, you don’t need a virtual server? You sure about that? Not for your blog? Not even if it scales infinitely? Not even if it’s free?

4. WordPress – Joomla and Drupal are cool, but WordPress is the king of the web page CMS.

5. Gmail – Seriously, stop deleting your email, get a gmail account. Use your own domains (Google Apps is still free for < 10 users).

6. Google Docs – If you haven’t had 10 people all editing the same spreadsheet at the same time you have not Cloud’d it up.

7. Cloud Music (Google Music, Amazon Music Locker, iCloud, Soundcloud, Spotify) – This is new, try them all out, find new music, sync your own.

8. Google – Search done right. Everyone has been playing catchup for a while now, and I’m sure that one day they will, but until then, google.com

9. Snopes – The Internet means rapid access to information sharing, but many people share false information. Sites like snopes.com

10. Shopping - Deal sites like slickdeals.net, fatwallet.com, woot and more track deals as they happen, often with good comments on how to maximize them. The people on some of these sites are mad geniuses when it comes to getting the most for your buck.

11. Skype – Everyone has it, get on and video chat your friends in other countries for free. Ride this one until Microsoft torpedoes it, and we all move to Google Chat, which you should be on already via your gmail account.

12. Linux – If you are even slightly technically inclined, Linux opens the door to you (for free) to everything from high end movie effects to  computer forensics. Get started with a Live CD from Ubuntu (Your computer is probably 64-bit, and you probably want the desktop version – You can boot the CD and use Linux without doing anything to your computer), and NO it does not run Office or any Windows program, but it does run thousands of cool programs.

13. Photo sites (Picasa, Flickr, Smugmug) – There is no reason you should be burning a photo CD to send to your friends and family. Get an upload utility, and start putting your photos on the ‘net. You don’t have to share them, and I would highly recommend NOT sharing them publicly unless they are very public information. I do not post pictures with faces in them without permission from the person owning the face, and, in general, don’t do this.

14. Education (Khan Academy, Alison.com, MIT OpenCourseWare, Instructables, k12) – There are too many to name, and access to nearly infinite information is pretty much its own education. Don’t think that just because a skill isn’t directly computer related you can’t learn how to do it online, and maybe for free.

15. Wikipedia - What is a wikipedia? Well, a wiki is a website that anyone can edit the pages of, so, Wikipedia is an encyclopedia that anyone can edit. Not always right, but rarely uninformative.

 

Well, I hope this helps. Please send me your lists or additions (comment below, or email to jon@jonzobrist.com).

 


No Comments

Thinking about hosting a WordPress site on S3

I recently moved my Joomla backed consulting website completely to Amazon S3, and have been very happy with the results. I would like to do something similar for my personal blog site at jonzobrist.com, however I would like it to be more dynamic, or at least easily update-able.

For my Joomla site, I did a complete mirror to static html and then uploaded all of that to S3, in a bucket with the same name as the site’s (www.bluesun.net), and changed DNS to point to the CNAME for that bucket’s HTTP address. This involved running wget -r -k -E -p -U Mozilla http://www.bluesun.net, editing the files wget copied to all point at the right places for things like menus, etc, and then uploading the files to Amazon S3.
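As a sketch of the upload step, the Python below walks the directory wget produced and works out the S3 object key and Content-Type for each file. It only builds the plan; the actual upload call (through whichever S3 client you use) is deliberately left out, and the `upload_plan` name and the directory layout are assumptions for the example.

```python
import mimetypes
from pathlib import Path


def upload_plan(mirror_root):
    """List (object key, content type) for every file under a wget mirror.

    `mirror_root` is assumed to be the directory wget created for the site,
    so that keys come out relative to the site root (e.g. "css/site.css").
    """
    root = Path(mirror_root)
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            # Fall back to a generic type when the extension is unknown.
            content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
            plan.append((key, content_type))
    return plan
```

Setting the content type per object matters because a browser decides how to handle each file from that header; without it, pages may be offered as downloads instead of rendered.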

My goal here is to recreate that in a more automated way, so that I can have a main site that is dynamic, but most, if not all, of the content is served from a static repository on S3. The expected outcome I think will be to take a site that costs around $15-20/month and make it cost < $1 /month. And, if I get some huge surge of traffic, to handle the load gracefully, and scale into the many terabytes of serving up data affordably.

A few quick thoughts/notes;

First, if you don’t change permissions on newly uploaded items on S3, they get your default permissions, which usually means no public access. However, if you upload a new version of a file, it keeps the permissions the previous version had.

Second, you cannot host a naked domain (in this case http://bluesun.net) on Amazon S3. This is more a limitation of the DNS standards, which say you shouldn’t put a CNAME on a naked domain. It means that you need something to redirect your naked domain to your web server. A lot of people don’t do this at all, but I think it’s a good thing to do. I think the details of this limitation will actually come in handy in my hybrid dynamic/static WordPress site.

Third, it makes a lot of sense to compress objects, and setting the right headers on the object will, I believe, get S3 to automatically serve it up in a way a browser can understand. Most of the things that make up web pages (HTML and JavaScript) are text based and compress very well. On the other side, images used on the web are generally already very compressed.
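Here is a minimal Python sketch of that idea: gzip the text-based objects before upload and record a Content-Encoding header to store with them, while leaving already-compressed types like images alone. The `prepare_object` name and the list of compressible types are assumptions. As far as I know, S3 simply returns whatever bytes and headers were stored with the object rather than compressing on the fly, so the compression has to happen before upload.

```python
from __future__ import annotations

import gzip

# Content types assumed to be worth compressing; images are skipped.
COMPRESSIBLE = ("text/", "application/javascript", "application/json")


def prepare_object(body: bytes, content_type: str) -> tuple[bytes, dict[str, str]]:
    """Return the bytes to upload and the headers to store with them."""
    headers = {"Content-Type": content_type}
    if content_type.startswith(COMPRESSIBLE):
        compressed = gzip.compress(body)
        # Only keep the compressed form if it is actually smaller.
        if len(compressed) < len(body):
            return compressed, {**headers, "Content-Encoding": "gzip"}
    return body, headers
```

One trade-off with this approach: a client that does not accept gzip would still receive the gzipped bytes, so it only suits sites whose visitors use ordinary browsers.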

Fourth, having a hybrid site means you will still have some dynamic objects and this will mean manually processing (or manually setting up automated processing) html files to separate dynamic from static content.

Fifth, I’m a huge fan of things like Google Analytics, which are hosted by Google, and only included in my site as a static snippet of code that pulls more code direct from their servers. I would love to have something similar for comments and other user generated content that messes up the static website paradigm. I think technologies like AJAX can really shine here.

Brief background, my site (jonzobrist.com) is a standard WordPress install, currently running on an EC2 Micro Instance running Ubuntu 10.04 with Apache/PHP/MySQL all running on one machine. It’s an EBS backed instance, and I snapshot the root volume. I don’t really make updates more than once or twice a week, and none of my content needs to be pushed live in any kind of urgent manner. That said, I use WP to Twitter to auto tweet new posts, so I need to be able to force an update, or handle not having new content on S3 gracefully. I don’t get a particularly large number of visitors, lately about 1,000 a month. My main motivation for doing this is to see if it can be done, so I can do it for other sites I support.

Here is a graphical representation of what I think it will look like when done.

Diagram of a static WordPress site on Amazon S3

Then I just need to push all the very static content to CloudFront for CDN!

What do you think?


No Comments

Free QR Code Generator, via Gina Trapani @ smarterware.org

Thanks to Gina Trapani at SmarterWare I found a cool, free, 2D barcode generator!

Her article about 2D barcodes is here

And the QR code generator is here

And here is the QR code it generated for my URL!


No Comments

Automatic WordPress Database Backups Are Awesome

I recently re-posted a link to 15 Incredible WordPress Plugins but wanted to note that I have been enjoying one in particular.

The WordPress DB Manager (WP-DBManager) is awesome at scheduling and executing backups of your WordPress database.

The database is where all of your valuable data is, with the only exception of custom graphics you have uploaded.

Restoring a WordPress site is easy if you have the database!

It will even send you a scheduled e-mail with the backup file attached.


No Comments

15 Incredible WordPress Plugins

I love WordPress, and I am constantly impressed by the quality and functionality in both WordPress and the many plugins that are out there. I just found this awesome list of WordPress Plugins at sitesketch101.com.

http://www.sitesketch101.com/15-incredible-wordpress-plugins-you-need

Thanks Nicholas Cardot @ Site Sketch 101!


1 Comment

Finally On EC2

I can’t believe it took me this long to nut-up and just do it. GoDaddy’s shared hosting “premium plans” are a joke in speed compared to a free micro instance on EC2. I did have an overage charge of $.01 last month. I expect that to go up as I start doing daily snapshots, S3 MySQL binlog syncs, and maybe some actual traffic. Who knows, I may end up owing DOLLARS each month for awesome performance in the real cloud.



2 Comments

iPad tips – 4.11.10

I’ve had an iPad for 2 days now, and have to say I’m very happy with it. The things I thought would be deal killers seem to be well thought out design choices (cheesy Apple rhetoric?).
Here are some things that I think will help me get the most out of it. Note, I AM typing this on the iPad, vertical thumb-pecking.

1. Use Safari. Safari is nowhere near the browser Firefox is. But we’re stuck with it on the iPad, and that’s not all bad news. Many applications you’re used to using as custom apps on your iPhone will actually be better experienced in Safari on your iPad. This goes for news and blog readers the most so far. iPad specific versions will come out, and they may be superior to the web version. But, for now, many things we’ve come to rely on special apps for are just better directly in Safari.

2. Embrace the cloud. I’m assuming you’ve already got email and calendars online and set up on your iPad, but that’s only the beginning. Bring massive shared file access to your iPad with Dropbox. I use this on 3 computers to keep all my frequently used files backed up and in sync. And with their free app, I can access all of my files on my iPhone and iPad. Basic files and office documents display from within the app, including spreadsheets. Use all of Google’s tools, either their full versions, or their mobile versions at m.google.com.

3. Buy apps! I know this sounds like more Apple propaganda, but you just dropped at least half a grand on this thing, why not spend a few more bucks to make it even better? I love Flight Control’s updated iPad version. I’ve sprung for: Elements XD (Very cool, but $13, about $11 too much), Racing HD ($10, nice show off, but I’m not into racing games and can’t play it well), Air Video ($3, streams videos from desktop to iPad/iPhone), Artstudio ($3, easy painting). My main complaint here is it’s not obvious if you can get a refund for apps, so I try to read reviews a lot and search the web for info before buying. And why not load up on free apps? I’m using Google’s app, Siri search assistant, Heyway, AIM, and Skype, all carried over from my iPhone, and have added several iPad specific ones that I’m trying out: iBooks, ABC Player, Netflix (streams video well to the iPad, this is HUGE), Associated Press, The Weather Channel Max, Google Earth, Allrecipes, AroundMe, Bloomberg, Adobe Ideas.

4. Make bookmarks on your home screen in Safari. I’ve used this for brizzly.com, since their page from an iPad outshines their iPhone app. It’s just like having an app, with the added bonus that it takes me into my already active Safari session. This extends using the cloud to make web pages seem like apps.

5. Keep a screen cleaning cloth in your bag. This screen is BIG, and SHINY! It gets dirty and smudged ALL the time. Keeping a soft screen cleaning cloth handy does more than just keep my screen clean, it relieves the anxiety of letting people “see” my new toy, which is always done with their hands. Favorite thing cleaned off my screen so far was a very dry booger after my kids were playing on it for a few hours. (seriously, WTF? How did they miss that?). I also get my cloth damp before a serious clean.

6. Make sure you have soft, shelf-like places to set your iPad in places you frequent. For me this means a space on the desk, a spot on the counter in the kitchen, and one on a shelf in the closet, which seems to be my transient storage location.

7. Practice. Typing, locating and using things like search, mail, calendar, and settings. Think for a moment what you use most, and spend a minute just practicing doing those things repeatedly. Saving a few seconds on a task you do a few thousand times really adds up, but, more importantly, it increases the usefulness of all the new tools on your iPad. I plan to practice touch typing, checking my calendar, searching my device, searching the net.

8. Fix that ailing wifi coverage. If you have an iPad now, you only have wifi networking. I’m hopeful this is enough for me, but it does require that I pay some attention to my wifi networks. I’m setting up another access point, maybe two, around the house, and will be tuning up the ones at work.

9. Put your own movies on it. If you’re not already ripping your DVD collection, download HandBrake and get started today. Rip in HandBrake, import to your iPad via iTunes, or stream via Air Video. Video really shows off on this thing. Only downside is the audio quality stinks, and you really need headphones to watch a full movie. This is the future of kids’ car trip entertainment.

Well, that’s it for now, I hope it helps you have a better iPad experience.


1 Comment
