So today I sent myself, from home, a script I wrote to check my rankings on google.com for my personal site, ListThatAuto.com. Basically, it takes my keyword list, parses it into Google URLs, uses sockets to connect, grabs the content, parses that content to pull out the result links, checks each result link for the name of my site, and displays that link and its position number on the page if it is there. Pretty simple and straightforward.
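For illustration, the core of that logic might be sketched like this. The function names and the exact URL format are my own guesses; the actual script is not reproduced here:

```php
<?php
// Hypothetical sketch of the rank-checking logic described above.
// Function names are my own; the original script is not reproduced here.

// Build a Google results URL for a keyword phrase (10 results per page).
function googleUrl($keyword, $page = 0) {
    return 'http://www.google.com/search?q=' . urlencode($keyword)
         . '&start=' . ($page * 10);
}

// Given result URLs already extracted from the page, return the 1-based
// position of the first link that mentions our domain, or 0 if absent.
function findPosition(array $resultLinks, $domain) {
    foreach ($resultLinks as $i => $link) {
        if (stripos($link, $domain) !== false) {
            return $i + 1;
        }
    }
    return 0;
}
```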
Here was the problem. When running the script at home, it would time out, because each socket call to Google took about 1.1 seconds from start to finish to grab and process the content. That unfortunately limited me to checking about 28 keywords at a time. I knew I could do better, so I ran a few home-brewed load tests and was able to identify parts of my code that could use subtle but significant improvement. I made these changes one by one and managed to reduce the time to about 0.8 seconds per keyword. Still, I was capped at about 37-39 keywords. My personal site has only 41, so I was just a few away. After about another hour of load testing, I came to the conclusion that my code was as efficient as it was going to get.
Though it was still rather slow, I sent it to work anyway. This is where the most amazing breakthrough took place. If you program in PHP, you have probably used sockets. During my tests at home, I found that most of my time per keyword was spent just communicating with Google. I posed this to our dev team (which I am part of), we brainstormed for a few minutes, and we developed a theory.
We believed that in PHP, all a socket does is open a stream, read from the source, and store the data in a buffer for the script to access at any time. That means we could send the request for the information immediately after opening the socket, move on to the next connection, and do the same until all the connections are made, then come back at our own pace to clean up, storing and processing the content as we grab it.
I rewrote my class as a single recursive function to grab, process, and display the information described above. This new script, which opens each connection and requests the data before moving on to the next one, and only then comes back to retrieve the responses, does the entire process in about 3.5 seconds.
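The two-phase idea can be sketched roughly as follows. The function names and request format are illustrative (the actual class is not shown in this post), and the connection-opening helper is defined but deliberately not called here:

```php
<?php
// A sketch of the "open everything first, read later" idea using PHP
// stream sockets. Function names and request format are illustrative;
// the actual class from the post is not reproduced here.

// Phase 1: switch every stream to non-blocking and fire off its request
// immediately, without waiting for any response.
function fireRequests(array $streams, array $requests) {
    foreach ($streams as $i => $fp) {
        stream_set_blocking($fp, false);
        fwrite($fp, $requests[$i]);
    }
}

// Phase 2: come back at our own pace and drain each stream in turn.
function collectResponses(array $streams) {
    $responses = array();
    foreach ($streams as $i => $fp) {
        stream_set_blocking($fp, true); // block only while reading this one
        $responses[$i] = stream_get_contents($fp);
        fclose($fp);
    }
    return $responses;
}

// Opening the actual connections might look like this (not run here):
function openGoogleConnections(array $keywords) {
    $streams  = array();
    $requests = array();
    foreach ($keywords as $i => $kw) {
        $fp = stream_socket_client('tcp://www.google.com:80', $errno, $errstr, 5);
        if ($fp === false) {
            continue; // skip connections that failed outright
        }
        $streams[$i]  = $fp;
        $requests[$i] = "GET /search?q=" . urlencode($kw)
                      . " HTTP/1.0\r\nHost: www.google.com\r\n\r\n";
    }
    return array($streams, $requests);
}
```

The win comes from overlapping the network round-trips: while Google is still preparing a response for keyword 1, requests 2 through N are already in flight.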
Pretty impressive, if you ask me.
loushou.
Thursday, June 26, 2008
Tuesday, June 24, 2008
Search Engine Development
Yesterday I was assigned a project to basically start plans for a new search engine, with a completely unique algorithm for determining page relevance and page rank. It will be attached to at least NewestMLM.com within a few months. We are spending several days coming up with an effective game plan for implementing the project, which should make the whole thing easier in the long run.
At home I have a site, ListThatAuto.com, still under development, for which I have built a few tools that I believe could be useful in developing this new search engine. They involve advanced URL stripping, content storage, and a couple of analysis tools. I should be sending them to my work email in the next couple of days, once I compile the list.
This is a big task and I am glad to be a big part of it. I feel like this will be one of those projects that lets you be part of something much, much bigger than yourself, and that is what I am looking for. SEO is hitting a frontier we can only dream of at this point; if we want to jump in at any point, the sooner the better, right?
loushou
Thursday, June 19, 2008
moving on to BIGGER not so much better things
After a careful analysis of the new search.jsp file from yesterday's project, I found that every bit of information I was hoping to learn by recreating the script is handled in the background by the Java applets running on the Tomcat connection. Thus all the work I have done so far amounted to nothing more than chasing rabbits...
Aggravated, I began to look for the source of this Java applet and all its components, only to find that I needed to get Subversion from Apache, because the source is held only in a repository and no packages exist for it. So I downloaded Subversion and installed it. After a few failed attempts at checking out the current source, I decided to HUNT for the proper syntax. I found the one-line command that EVERYONE before me has executed to get this source, only to find that on my computer it does not work.
After a little more searching, I found one other place I could download the source from. The catch is that I would have to right-click each of about 200 files and do a save-as on every one. Thinking to myself that this is bogus, I decided to start writing a PHP script to do it for me. That is where I am leaving off today; I will continue tomorrow.
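A rough sketch of what such a script might look like. The listing URL is a placeholder, since the actual repository address is not given in this post, and the real downloader function is defined but not run here:

```php
<?php
// Rough sketch of the batch-download idea described above. The listing
// URL and the .java pattern are placeholders, not the real repository.

// Pull every .java href out of an HTML directory listing.
function extractFileLinks($html) {
    preg_match_all('/href="([^"]+\.java)"/i', $html, $m);
    return $m[1];
}

// Fetch each linked file and save it locally (not run here).
function downloadAll($listingUrl) {
    $html = file_get_contents($listingUrl);
    foreach (extractFileLinks($html) as $file) {
        file_put_contents(basename($file), file_get_contents($listingUrl . $file));
    }
}
```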
Loushou.
Wednesday, June 18, 2008
more JSP
So something weird happened when I came in today to continue my woeful travels through the JSP search engine stuff. All was well until I tried to view the search file again. Quanta locked up and forced me to restart it, and when I did, the search.jsp file had been truncated to size 0 (for those of you not in the know, that means essentially all the data in the file went bye-bye).
So for the better portion of the day, working on the assumption that a good portion of the code we use is open source, I have been trying to locate the proper version so the file could be replaced. That did not work, because apparently we got this code from someone who had modified it slightly to add functionality, changing some of the key structure of the file that is now gone. Recreating it from memory would be a feat in itself, so we are forced to get it again from the company that supplied it.
Who knows how long that will take. In any event, once we get it back, I should have developed another solution to this problem that is killing all our listings.
loushou.
Tuesday, June 17, 2008
JSP
Well, some of our software runs off of JSP. Unfortunately, it is producing unexpected and very inconsistent results. See, we have a spider that does just that to pages on the net; after pages are spidered, they are thrown into a search tool. Now, we have submitted many sites to the spider, and a portion of them cannot be found when we search. We know they were spidered. We know the others are showing up. Still, they remain missing. This JSP script uses some type of DB that, as far as I can tell, is not SQL-driven, which poses a problem for me.
The problem is that all of the back end that interacts with the DB is compiled Java; the JSPs just execute those commands and grab the results. So... we are stuck for the moment.
If anyone can figure it out, it is me.
loushou.
Friday, June 13, 2008
Ebay....XML.... not a good mix
Well... I really don't know what else to say. XML should be XML, especially if it is structured properly, which this code is. The problem is that we throw the XML, along with the XSL template, into the XSL processor, and magically certain nodes cannot be accessed from the template. If we dump the raw XML, the nodes are there. In fact, if we use the template function that returns the name of a node, inside a template that matches all the children of a node, we can get the names of the nodes that "don't exist" even after the XSL processor parses it. However, as soon as we step into the main node where the information we need lives, it craps out on itself.
Frustrating, confusing, and inconsistent.
Loushou.*frowning*
Thursday, June 12, 2008
Learn XML, XSL, and XSLT.... NOW!!!!
This is pretty much the only thing going through my mind right now. I need to actually take the time to learn these, because they are my primary weakness. I could probably avoid a large number of questions if I just mastered these few markups. Because I have not used them before, I just stare blankly at the code and change small things, hoping for results.
This must change. NOW!!!!
Loushou.
Tuesday, June 10, 2008
Cervisia -- the devil and god simultaneously
Well, well, well. CVS is a very powerful tool. It ensures that good copies of software are kept track of, so if you mess something up you can revert. It also allows the creation of branches, for projects that are either separating from the main project or for experimental modules within the bigger picture. It is extremely flexible in almost every regard.
However, on occasion it can do everything but what you want it to. For the better portion of the day (prior to lunch), Steve and I worked to get it set up properly. It took that long not only to reteach ourselves how to use it, but to get our front end (Cervisia) to work with us instead of against us. Consequently, we did not spend as much time on our actual work.
Looking at the big picture, though, this is not a big issue, because it is something we need in place, and it could save us even more time down the line. So even though it seemed as though we wasted a lot of time today, it WAS IN FACT a big step in the right direction. I am looking forward to more software implementations that will increase productivity and the security of our information in the months to come. In that respect, there should be a plethora of new software integrations.
Loushou.
Monday, June 9, 2008
Love Sharing Knowledge
So out of the blue, my C++ experience had its shining moment today. I was able to share some of my infinite knowledge of C++ with my manager and fellow team-mate (yes, singular).
Type casting, a very old syntax (pre-dating C++ all the way back to C), was in fact inherited by PHP. You can force a type onto any value, even though PHP types are very loose, so that if a function from some random library requires a certain type, it will not warn you or error out. This is very useful for ensuring that your releases do not contain errors, warnings, or notices.
This technique works for the basic types: bool, string, integer, and float. The syntax is:
(type)$myVariable
e.g.: (string)$var
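Here it is in action; the var_dump output comments show what PHP actually produces for each cast:

```php
<?php
// Casting in action. PHP's cast keywords are (bool), (int), (float)
// (also spelled (double)), (string), and (array).

$raw = "42";          // a numeric string, e.g. from a form or a DB row
$num = (int)$raw;     // force it to an integer

var_dump($num);            // int(42)
var_dump((float)"3.5");    // float(3.5)
var_dump((bool)"");        // bool(false): an empty string casts to false
var_dump((string)123);     // string(3) "123"
```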
Hope this helps in the future of everyone's programming endeavors.
Loushou.
Friday, June 6, 2008
not HAL.... DAL!
Today I worked primarily on the DAL for the entire Timberlake template system. This, to say the least, is a task and a half. However, if anyone can do it, it is me.
The first thing to remember about a DB abstraction layer is why it is there. Say you have all your information stored in a MySQL DB and then decide to switch to Oracle, because you are storing much more data than expected and want a more reliable interface. If you did not have a standardized DB abstraction layer, you would likely be making every call independently of one another. That means that when you switch, you would need to manually change every single call that opens a DB connection, every single call that queries the DB, and probably some of the SQL itself. With a DB abstraction layer, you feed the layer information, the layer interprets it, and then it executes the command you want. With that in place, you only have to change how one portion of your code interacts with the DB, instead of every single line.
This holds a reasonable promise that if you ever do change DBs, your code will still work with relatively little modification. Powerful! That being said, such layers can be complex, depending on the need and on how OCD you are. LOL, like me. I want it to be able to do everything you need to do, with options so you can do it several different ways.
In any event, I think this one may be cut down to be more efficient and less friendly, with the main functions plus specific views for standard information that is called regularly. Other than that, it should be a little more abstract and a little less specific, but still with the necessary views.
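A toy sketch of the principle. The class and method names here are mine, not the actual Timberlake DAL: the application talks only to an interface, so switching from MySQL to Oracle means writing one new driver class instead of touching every query site.

```php
<?php
// Toy DB abstraction layer. Names are illustrative, not the real DAL.

interface DbDriver {
    public function connect($host, $user, $pass, $db);
    public function query($sql);
    public function fetchAll($result);
}

// One concrete driver; an OracleDriver would implement the same interface.
class MySqlDriver implements DbDriver {
    private $link;
    public function connect($host, $user, $pass, $db) {
        $this->link = mysqli_connect($host, $user, $pass, $db);
    }
    public function query($sql) {
        return mysqli_query($this->link, $sql);
    }
    public function fetchAll($result) {
        $rows = array();
        while ($row = mysqli_fetch_assoc($result)) {
            $rows[] = $row;
        }
        return $rows;
    }
}

// Application code depends only on the interface, never on a driver:
function loadUsers(DbDriver $db) {
    return $db->fetchAll($db->query('SELECT * FROM users'));
}
```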
Loushou.
Thursday, June 5, 2008
Some new... some old
Well, my first day in Dev went well. I learned, for the most part, how the API for Timberlake works, and because I seem to know what I am talking about, I was assigned to redo the data abstraction layer. That sounds harder than it should be. In any event, today I ported the eBay API to a Timberlake class.
This was far easier than I anticipated; at first glance, the code eBay provides to developers looks complex and long. However, I was able to condense the main portion of it into 2 functions of 4 lines each... make your own conclusions from that.
Then I started on my next task for the same project: making a Timberlake module for this eBay API. Consequently, that meant I needed to learn how to use the Timberlake API. After about an hour of studying, I got a basic understanding of it and of which functions I needed, and began to write. This should be done tomorrow. Anyways....
Loushou
Wednesday, June 4, 2008
Dev... here I come.
Totally, totally, totally. I am totally looking forward to moving over tomorrow. Back end is my specialty. Programming is my forte, and I love doing it, so it will not feel much like work to me. XML and its family are at the top of my list to learn, and I am fully ready.
While Dev is what I want, I have to admit that learning SEO is very much a good thing. This skill is paramount in building successful websites and a key part of directing traffic to a site; a whole "wrapper", so to speak, that now looms over the entire eCommerce industry. Since this is in fact our main product, I am glad I got the chance to learn it, so that in the future I will be able to help others if given the opportunity. On top of all that, I am positive the SEO learning will not stop there, as part of Dev is knowing how it works and how to implement it in everything we do.
I am thoroughly enjoying this job, and I am glad I was given the opportunity to join such an awesome team.
Loushou.
Tuesday, June 3, 2008
!Stream-lined!
So finally I have this whole process down to a science. It takes about 38 minutes to do all the updating, submitting, uploading, and all that. The big part is the design... I am finding it kind of difficult to constantly come up with a different design for companies that do practically the same thing. For instance, MLMs. They are all pitching the same spiel, but we need to make a different site for each of them.
I am running out of ideas... and quick! The tricks I usually use for creating unique site designs are not working because of the similarities. In any event, now that I am done pseudo-ranting: I have the process streamlined, and all that takes time now is the aesthetics. That part is still taking me hours.
Here is my most recently completed site: http://www.successwithaninternetbusiness.com/
Monday, June 2, 2008
First MLM... WOOT?
I just finished my first MLM. Unless I did something wrong, it really does not seem that different from a regular business's optimization. The only discernible differences are that they require substantially more content than others, and they need the email address displayed instead of the phone number. Saishu showed me some of the category selections for the engines; PROPS! Here is the link:
http://www.createanewlifestyle.net/
I think on my next page I am going to change up the design a bit... maybe do something spiffy. LOL. Anyways, back to work.
Loushou.