Friday, August 7, 2009

The Ophone developer site forgets about Android

Ophone is China Mobile's "independently developed" mobile operating system, and its developer site, http://www.ophonesdn.com, was launched recently.

Why the quotation marks around "independently developed"? Because Ophone is in fact a modified, second-stage development of Google's Android operating system, not an original creation. The fully open-source Android system laid the crucial groundwork for Ophone; in terms of design and architecture, Ophone is simply a modified version of Android. Standing on the shoulders of giants is a good thing, but having climbed up there, refusing to even mention your teacher's name while striking the pose of a ground-breaking independent innovator is not. Ophone could perfectly well describe itself as "a second-stage development based on Google's Android system, tailored to China Mobile's specific needs", and that would hardly be anything to be ashamed of.

Setting all that aside, Ophone does have some prospects, provided China Mobile can come up with reasonable data plans and run its application store (Mobile Market) well. Once users can afford data and dare to use it, application sales will pick up; once applications actually sell, developers will be motivated to rack their brains building better programs; with more good applications making users' lives easier, users will come to rely more on the 3G network. Open up each link in this win-win chain and China Mobile's payday will not be far off.

Finally, I registered an Ophone developer account myself and will give it a try when I have time :)

p.s. Why on earth did China Mobile cut the Mac version of the SDK? Damn it.

Wednesday, August 5, 2009

Packing an all-in-one JAR for Hadoop (HadoopJar)

Hadoop allows us to pack our code into a jar file and run it with "hadoop jar mycode.jar". However, if our code depends on other jars (as most non-trivial code does), distributing those dependency jars becomes a problem.

Here I introduce an approach I use myself, which packs our own code and its dependency jars into a single jar. This all-in-one jar can be run with "hadoop jar my-all-in-one.jar", and all dependencies work without a problem. I call this solution HadoopJar.
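
To make the dependency problem concrete, here is a minimal sketch of the kind of entry point the build below assumes. The class name matches the Main-Class attribute set in the manifest further down; commons-lang is only a stand-in for whatever third-party jars your own code happens to pull in.

package org.nogroup;

// A minimal stand-in for the entry point named by the Main-Class attribute in the manifest.
// StringUtils comes from commons-lang.jar, i.e. one of the dependency jars that must
// travel inside the all-in-one jar for the job to run on the cluster.
import org.apache.commons.lang.StringUtils;

public class Main {
    public static void main(String[] args) {
        // Real job setup (JobConf, mapper, reducer, input/output paths) would go here;
        // this line exists only to exercise a third-party class.
        System.out.println("args: " + StringUtils.join(args, ", "));
    }
}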

I use an Ant task to do this. The XML snippet is as follows:
<target name="hadoop-jar" depends="compile" description="Create binary distribution">
<!-- Firstly, copy all dependency jars into build/lib, while build is the root folder for the future jar. -->
<copy todir="${path.build.classes}/lib">
<fileset dir="lib">
<include name="**/*.jar">
<!-- We exclude hadoop-*-core.jar because it's already in the hadoop classpath -->
<exclude name="**/hadoop-*-core.jar">
</exclude>
</include>

<!-- Combine all dependency jars' names to a string, which can be used as a CLASSPATH value -->
<pathconvert property="hadoop-jar.classpath" pathsep=" ">
<regexpmapper from="^(.*)/lib/(.*\.jar)$" to="lib/\2" />
<path>
<fileset dir="${path.build.classes}/lib">
<include name="**/*.jar" />
</fileset>
</path>

<!-- Generate a manifest file contains the previous made CLASSPATH string -->
<manifest file="MANIFEST.MF">
<attribute name="Class-Path" value="${hadoop-jar.classpath}" />
<!-- Set a default entry point -->
<attribute name="Main-Class" value="org.nogroup.Main" />
</manifest>

<!-- Pack everything into one HadoopJar -->
<jar basedir="${path.build.classes}" manifest="MANIFEST.MF" jarfile="${path.build}/learning-hadoop.jar">
<include name="**/*.class">
<include name="**/*.jar">
</include>

<!-- Delete the manifest file -->
<delete dir="${path.build.classes}/lib" />
<delete file="MANIFEST.MF" />

</target>


We are done :). I tried this on our hadoop-0.15.0 cluster with 6 machines, and it also works in higher versions of Hadoop, including Hadoop in local mode.
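
If it helps, the way I invoke this is simply "ant hadoop-jar", and then "hadoop jar learning-hadoop.jar <input> <output>" from wherever ${path.build} points (the two arguments are just placeholders for whatever your main class expects). As far as I can tell, "hadoop jar" unpacks the jar into a temporary directory and adds everything under its lib/ folder to the job's classpath, which is why the dependency jars copied above get picked up automatically.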

Hope this helps.

p.s. Dependency jars are also known as third-party jars or third-party libraries.
I also wrote a Chinese version here.

Saturday, July 11, 2009

I traveled thousands of miles over the great wall just to say a word

It really hasn't been easy; I haven't been able to log on to my own blog for ages. I abide by the law and never talk about you-know-what, so how on earth did I end up muddle-headedly walled off by the glorious Green Dam?

Monday, May 18, 2009

Job Opening: Google Beijing Research

Job Title

Senior Research Scientist

Job Description

Google Beijing Research is looking for two highly motivated senior scientists to research and develop distributed machine learning algorithms for Web-scale applications such as search relevance, classification, recommendation, advertising, and user research. Beijing Research has been collaborating with scientists and research interns from top universities worldwide, including MIT, UC, Tsinghua, and PKU. The positions will be at the Google Beijing office, located in Tsinghua Science Park. Among others, a couple of open-source algorithms developed by Beijing Research are publicly available:

· Parallel SVMs (PSVM)

· Parallel LDA (PLDA)

Qualifications:


1. PhD in Computer Science, Mathematics, Statistics, or related areas, with 3-5 years of experience.

2. Strong publication track record at top refereed conferences.

3. Strong motivation to work with large-scale, real-life data mining applications in a team environment.

4. Perseverance, focus, integrity, the ability to get things done, leadership in organization and partnership, and communication with teams in the US.

5. Proven knowledge in Machine Learning, Data Mining, Natural Language Processing, Numerical Analysis, and/or Optimization.


If you're interested, get in touch with me, haha.

Tuesday, May 12, 2009